Tanja Mortelmans, Daniël Van Olmen, Frank Brisard
Introduction 


The present volume contains a selection of contributions that were written on the 
occasion of Johan van der Auwera's retirement. Its focus is on the study of lin- 
guistic variation, an area on which Johan has had considerable influence. It 
brings together papers dealing not only with cross-linguistic and diachronic var- 
iation but also with intra-linguistic and inter-speaker variation. The phenomena 
that are examined range from negation and tense-aspect-modality over connec- 
tives and the lexicon to definite articles and comparative concepts in well- and 
lesser-known languages. This collection can thus be said to reflect Johan van der 
Auwera’s very broad linguistic interests and to contribute to our understanding 
of variation in general. 

Johan van der Auwera graduated in Germanic Philology (Dutch and English) 
and Philosophy in 1975 at the University of Antwerp. In the same year, he got a 
doctoral scholarship to study at the University of California in Berkeley, where he 
stayed for two years. He then returned to finish his dissertation (on the philoso- 
phy of language: Regaining speculative grammar — Speech acts, logic and focus) 
at the University of Antwerp (1980), with Louis Goossens (professor of English 
Linguistics) as his supervisor. After finishing his dissertation, he spent a few 
months as a guest lecturer at the University of Cologne and was then awarded a 
two-year Postdoctoral Fellowship of the Belgian National Science Foundation, 
which he took up at the University of Antwerp (UIA). From 1983 until 1984, he 
was a Lecturer in Business English at the University of Antwerp (UFSIA), before 
returning to research abroad: in 1984-1985, he worked as an Alexander von Hum- 
boldt Fellow at the University of Hannover (1984-1985) and the Max Planck In- 
stitute for Psycholinguistics in Nijmegen (1985). 

In 1985, at the age of only 32, he obtained a tenured position as a Fellow of 
the Belgian National Science Foundation at the University of Antwerp, where he 
was affiliated with the Department of Germanic Philology. In 1990, he finished
his habilitation dissertation simply entitled Coming to terms, which addressed as- 
pects of the structure of the noun phrase. At the time, Johan was mainly expected 
to conduct research. His teaching load at the University of Antwerp (UIA) was 
thus restricted. Still, the courses taught from 1985 onward already reflect the 
breadth of his research interests on the one hand and their development over the
years on the other: Johan taught an Introduction to semantics for more than ten 
years, he acquainted students with the structure of Yiddish (1988-1997) and in- 
troduced them to Non-Indo-European linguistics (1985-1989) and the Syntax of the 
relative clause (1986-1987). His course on Semantics and logic disappeared from 
the curriculum after three years (1985-1988) whereas a new course on Typology 
and universals was introduced in 1993. Johan was to teach it until the very end of 
his academic career. 

In 1997, Johan was appointed Professor of English Linguistics at the Univer- 
sity of Antwerp, where he became a Full Professor in 2003. In the last two decades 
of his university career, he taught courses on (various aspects of) English gram- 
mar, varieties of English and English creoles, the development of English, nega- 
tion and linguistic typology, both in the bachelor's and in the master's program. 
The list of his visiting professorships is long: Johan was a visiting professor at the 
CNRS (Paris), Princeton University, the University of Gothenburg, Kyoto University
and Chulalongkorn University (Bangkok). In 2006, he was awarded the prestigious
Francqui Chair (by the Belgian Francqui Foundation), for which he gave
a number of lectures on typology. He taught master and doctoral classes on vari- 
ous aspects of typology at the Complutense University of Madrid (2008), the Uni- 
versity of Palermo (2009), the Université Libre de Bruxelles (2015) and the Uni- 
versity of Roma III (2015). 

In the very beginning of his career in linguistics, Johan published on classic 
topics within semantics and pragmatics, addressing presuppositions, conversa- 
tional implicatures and indirect speech acts, to name but a few of the terms that 
occur in the titles of his earliest publications. In fact, his very first publication 
(van der Auwera 1975), which appeared as the second issue in the series Antwerp 
Papers in Linguistics, deals with semantic and pragmatic presupposition. It was 
the topic of his licentiate's dissertation, also supervised by Louis Goossens. His 
earlier publications also include work on conditionals, complementizers, relative 
clauses and modality, some of which reveal a rather strong logical orientation. 
However, in an interview with Jolanta Šinkūnienė (2016: 295), Johan admits
to having been “disappointed with this logical approach to language” at a particular
stage in his career.¹ It is “a change of pathway” that has allegedly saved him
and this change was his interest in typology, an area of linguistics which at that 
time (i.e., in the early nineties) was not yet prominent in Europe. An important 
catalyst of typological studies in Europe was the EUROTYP-project, a large-scale 
research project funded by the European Science Foundation (1990-1995) which 
aimed at examining the range of typological variation found in the languages of 
Europe. Johan participated in this project, working mainly on adverbials. About 
ten years later, he was also one of the 55 authors of The world atlas of language 
structures (Haspelmath et al. 2005), (co)-authoring seven chapters on the expres- 
sion of modality and moods. We can safely say that Johan is one of the leading 
typologists in Europe and perhaps even beyond. 

His many publications (six monographs, over 200 scholarly articles, 23 books 
or special journal issues as editor) testify to his extremely broad linguistic inter- 
ests: they deal with conditionals, concessives, mood(s), modal expressions, 
tense, negation, aspect, indefinites, impersonals, periphrastic do, prefixes, ad- 
verb(ial)s, similatives, relative clauses, the structure of the noun phrase, 
(de)grammaticalization, the status of constructions, semantic maps and so much 
more. In fact, it seems almost impossible to find a topic within the broad domain 
of the semantics of grammar on which Johan has not worked and published. The 
field he has had most impact on is probably (or should we use another marker 
here: undoubtedly?) that of modality. The article he wrote with Vladimir Plung- 
ian on Modality's semantic map (van der Auwera and Plungian 1998) is one of the 
classic and most cited articles in the field, with no less than 986 citations on 
Google Scholar (on March 21, 2018). 

When one looks at Johan's impressive list of publications, a number of things 
strike one's eye - and they reveal something about his person(ality) as well. First, 
we should mention the fact that he has not only published in English and Dutch 
but also in French, German, Croatian, Russian and Chinese. This is in line with 
his huge language expertise: he speaks at least five languages fluently (Dutch, 
English, French, Swedish and German), has a more than average knowledge of 
many other Germanic and Romance languages, reads Russian and has notions of 
Thai and Mandarin. Second, many publications are the result of collaborations
with other linguists from all over the world. Throughout the years, Johan has built
up an enormous international network and has been using it in an extremely generous
way to connect linguists with each other. Third, Johan does not seem to
have followed a particular publication strategy. Instead, we find a wide array of
publication types, from articles in highly acclaimed peer-reviewed international
journals (e.g., Language) to contributions to more local book volumes and journals.

1 A nice example of this change of heart can be found in the title which he gave to a volume in
honor of Louis Goossens: English as a human language (van der Auwera, Durieux and Lejeune
1998) - an implicit reference/reply to the influential work by Richard Montague: English as a
formal language.

His academic record is not limited to his publications, of course. His list of 
conference papers is impressive. It starts with a presentation on Presupposition 
and existence given at the University of Louisville in 1976. Over 200 other papers 
were to follow, which Johan presented - in his somewhat idiosyncratic but always
enthusiastic manner - either alone or (which is considerably more often the
case) together with other, often young researchers. To illustrate Johan's professional
zeal and boundless energy, consider the nine papers that he gave in 2004,
mostly together with other researchers, on a multitude of topics: negative in- 
definites in Flemish (April, Berkeley, USA), nominalization in Tucanoan (May, 
Santa Barbara, USA), the ancestors of the verb need (May, Verona, Italy), inter- 
rogative pro-verbs (August, Nancy, France), English modal verbs (September, 
Pau, France), modal polyfunctionality in Standard Average European (Septem- 
ber, Antwerp, Belgium), imperatives in Slavonic (September, Leuven, Belgium), 
deverbal nouns as questions (October, Lille, France) and Slavic verbs and ad- 
verbs of epistemic possibility (November, Regensburg, Germany). 

Another important aspect of Johan's academic activities concerns his profes- 
sional guidance of younger linguists. Between 2002 and 2016, he supervised ten 
PhD dissertations, which again cover a broad range of topics: from Bultinck's 
(2002) dissertation on the meaning of English cardinals from a Gricean perspec- 
tive over modality in Chinese and English (Li 2003), the diachrony of need (Taey- 
mans 2006), tense, aspect and mood markers in Turkish (Temürcü 2007), inter- 
rogative pronominals (Idiatov 2007), imperatives (Schalley 2008; Van Olmen 
2011), tense and aspect (De Wit 2014) and indefinites (Van Alsenoy 2014) to nega- 
tion (Vossen 2016). 

Apart from the obvious quality and quantity of the research he conducted, 
Johan also excelled in the number of services he has lent the academic commu- 
nity. At the University of Antwerp, Johan co-founded the Centre for Grammar, 
Cognition and Typology in 1999, which he directed until 2014 and which was also 
the home of many international guests. Moreover, Johan was the chair or a mem- 
ber of expert committees for national and international research councils such as 
the European Research Council (2006-2013), the European Science Foundation 
(2005-2010), the Belgian Research Councils (Flemish 2000-2009, French 2010- 
2015) and the French, Swedish, Danish and Norwegian Research Councils, as well
as various national research assessment committees. Not least remarkably, since 
2005, Johan has been Editor-in-Chief of the prestigious journal Linguistics. 

Another important chapter in Johan’s professional biography is his work for 
the Societas Linguistica Europaea. He was elected President of the SLE in 2004. 
Seeing how the organization worked, Johan quickly realized there was room for 
improvement. More specifically, he saw the SLE’s potential to grow from a Central 
European old boys network into a genuine European forum for linguistics. The 
SLE benefited immensely from Johan’s excellent international contacts and his 
good relations with many publishers. In 2005, he proposed Teresa Fanego as the 
new Editor-in-Chief of Folia Linguistica. This was the start of a process of modernization
of the SLE, which meant making the organization more attractive to different
generations. Together with people like Christian Lehmann and Anna Siewierska,
Johan prepared real change. In the following years, this new dynamism
was consolidated with the organization of regular board meetings and the ap- 
pointment of a conference manager (a role first fulfilled by Bert Cornillie, who 
had worked at the University of Antwerp for some years). As a consequence of all 
this, the SLE has evolved into a vibrant linguistic organization with many mem- 
bers, whose yearly conferences are widely attended. The decisions taken under 
Johan’s leadership have greatly contributed to this success. 

It is evident from a look at his prosperous career that the hope which Johan 
expressed in the final words of his first publication has been more than fulfilled: 
“Notwithstanding restrictions we hope that we have not been beating dead 
horses and would be pleased if we have stuck out our neck on a few central is- 
sues.” (van der Auwera 1975: 69). We hope that the present volume testifies to 
Johan’s example and offers some new insights into issues central to one of his 
long-standing interests, i.e., linguistic variation — in the widest sense of the term. 
It includes contributions from some of Johan’s collaborators that have come clos- 
est to his own interests (at one point or another) and whom he evidently respects 
for their work. 

Dagmar Divjak looks at complementation, which has repeatedly come up in 
Johan’s list of publications too (e.g., van der Auwera 1990; Ammann and van der 
Auwera 2004), and examines to what extent a theoretical analysis of Polish com- 
plementing constructions reflects either individual speakers’ or the speech com- 
munity’s knowledge. In tribute to Johan’s extensive work on areal linguistics 
(e.g., van der Auwera 1998a, 2011), Volker Gast and Maria Koptjevskaja-Tamm 
make innovative use of lexical databases to study areality as a factor in lexical 
typology. Martin Haspelmath develops his notion of comparative concepts and 
elaborates on the differences with descriptive linguistic categories, thus addressing
issues raised by van der Auwera and Sahoo (2015) among others. Dmitry Idiatov’s
paper is the second one that takes an areal point of view. It focuses on
clause-final negation in Africa and builds on much of Johan’s recent research 
(e.g., Devos and van der Auwera 2013; van der Auwera and Van Alsenoy 2016). 
Ekkehard König looks at the languages of Europe - a perspective also taken in,
for instance, van der Auwera (1998b) and Gast and van der Auwera (2011) - and 
describes the formal and functional variation of their definite articles. 

Pierre Larrivée and Adeline Patard explore the diachronic implications of se- 
mantic maps, the theory of which is a recurrent theme in Johan's research (e.g., 
van der Auwera 2008). They also deal with two phenomena that are of central
interest in his career: modality and negative indefinites (e.g., van der Auwera 
2017). Jacques Moeschler's paper can be said to honor especially Johan's earlier 
work on theoretical pragmatics and the logic of language (e.g., van der Auwera 
1985, 1997) and answers the question whether logical connectives are truth-func- 
tional. Vladimir Plungian shares his interest in verbal categories with Johan (e.g., 
van der Auwera and Plungian 1998; Plungian and van der Auwera 2006). His con- 
tribution to the present volume concentrates on verb paradigms in Eastern Arme- 
nian and in particular on hidden semantic distinctions close to (ir)realis and 
(im)perfectivity. Jean-Christophe Verstraete contributes to the literature on epis- 
temic modality (e.g., van der Auwera and Ammann 2005; Vittrant and van der 
Auwera 2010) by identifying an unknown pattern of epistemic marking in the Na- 
tive Australian languages of Cape York Peninsula and explaining its origins. 
Jacqueline Visconti, finally, examines the development of the Italian connective 
anzi ‘on the contrary’ and, more specifically, its procedural meanings, an area of 
semantics that also features prominently in Johan's work (e.g., van der Auwera 
1993; van der Auwera and Coussé 2016). 


Acknowledgement: Thanks are due to Kasper Boye, Cecil Brown, Anne Carlier, 
Östen Dahl, Liesbeth Degand, Maud Devos, Anaid Donabédian, Caterina Mauri
and Heiko Narrog for their help with the review process, to the authors for accept- 

Dagmar Divjak
Binding scale dynamics 


Fact or fiction? 


Abstract: This paper contributes to current debates in linguistic theory and meth- 
odology by focusing on discreteness versus continuity in linguistic description as 
well as on the importance of structure versus use for understanding mental rep- 
resentations of language phenomena. It does so through a case study on the 
Polish [finite verb + infinitive] construction, henceforth [Vfin Vinf]. Within a Cog- 
nitive Linguistic framework, Divjak (2007) proposed a structurally underpinned 
Binding Scale encompassing eight levels of looser to tighter integration, with 
verbs expressing modality, intention, attempt, result and phase representing the 
most integrated type of [Vfin Vinf] constructions. Cognitive Linguistics aims to 
give a usage-based account of the complex system that language is, grounded in 
general cognitive principles. But at which level of abstraction should we pitch the 
linguistic description of a system such as the [Vfin Vinf] system to find such mo- 
tivating principles at work? In this paper, I assess the distance between usage 
and structure by investigating whether the proposed Binding Scale can be relia- 
bly distinguished in judgments of usage events through statistical unsupervised 
learning. By experimenting with the type of abstraction that needs to be imposed 
on acceptability ratings to arrive at a meaningful classification, conclusions can 
be drawn about the social or mental nature of this structure. 


Keywords: structure, use, discreteness, continuity, cluster analysis, Polish, 
Binding Scale, complementation 


1 The structure versus usage debate 


During most of the 20th century, the classical Saussurean distinction between 
Langue and Parole dominated mainstream linguistic theory. Generativists took 
the distinction between Langue and Parole on board, accepting there to be structural
facts and usage facts that are in principle independent of each other and
can be described in complete isolation from each other. Once performance errors 
are declared irrelevant to competence, it suffices to describe facts about structure 
or competence, to the neglect of use or performance. As an added bonus, allow- 
ing linguists to study an idealized version of language greatly simplified linguis- 
tic analysis. 

Cognitive and functional approaches have been challenging this view for the 
past four decades, stressing the usage-based nature of structure. Within the func- 
tional-cognitive camp, this has led to a focus on usage facts to the extent that now 
structure is largely ignored. A radical usage-based approach would seem to do 
away with the notion of system altogether, indeed (Geeraerts 2010: 258). Yet, “ac- 
counts of language usage, language acquisition and language change are impos- 
sible without an assumption about what it is that is being used, acquired, or sub- 
jected to change. And more moderate functionalists and cognitive functionalists 
recognize both structural facts and usage facts as genuine facts central to the un- 
derstanding of language” (Boye and Engberg-Pedersen 2010: vii). 

Much cognitive and functional writing does not concern itself with charac- 
terizing the precise relationship between usage and structure. Usage is observa- 
ble, but where is the structure? Geeraerts (2010: 237) suggests “a dialectal rela- 
tionship between Structure and Use: individual usage events are realizations of 
an existing systemic structure, but at the same time, it is only through the indi- 
vidual usage events that changes might be introduced into the structure”. Boye 
and Harder (2007: 572) agree that “language is indeed based on actual, attested 
usage, but that it rises above attested instances in providing the speaker not only 
with actual usage tokens but also with a structured potential that is distilled out 
of previous usage”. 

Structure no doubt plays a role in linguistic description and theorizing but
the question that I want to pose here is whether speakers distil and store structure 
out of use. And if they do, how similar is the structure stored by speakers to the 
structure proposed by linguists? 


2 The role of abstraction in linguistic description 
and representation 


On a methodological level, the discussion about the relationship between struc- 
ture and usage resurfaces as the ongoing debate about the choice for continuity 
or discreteness in linguistic analysis (for a first book-length treatment, see Fuchs
and Victorri 1994). In the following two sections, I will discuss the role of abstrac- 
tion in linguistic description (Section 2.1) and in linguistic representation (Sec- 
tion 2.2). 


2.1 The role of abstraction in linguistic description 


Separating Langue from Parole and declaring the former to be the object of linguistic
study allowed Saussureans to focus on the “neat and tidy” side of linguistics
and to describe language structure independently of language use in terms
of clean paradigmatic and syntagmatic relations. This discrete frame of descrip- 
tion marginalized phenomena falling outside the realm of such an approach, a 
trend that was further supported by the Chomskyan focus on syntax and prefer- 
ence for algebraic formalizations. 

Nevertheless, there have always been dissidents, denouncing the reduction- 
ism inherent in discrete models. The past few decades have witnessed a surge in 
explicitly continuous models, both for analysis and for representation, couched 
in functionally oriented frameworks. Langacker (2006) remarks that all (linguis- 
tic) models are metaphorical, and all metaphors are potentially misleading. Alt- 
hough, generally speaking, formalists tend towards metaphors involving dis- 
creteness while functionalists favor those based on continuity, even functionalist 
metaphors based on continuity such as the network model have been (rightly) 
criticized for being too discrete. The network model, for example, remains too 
discrete in the identification of sub-meanings and fails to capture the continuous 
dispersal of phenomena (Janda 2009: 111). 

What is it that is discrete or continuous? Is continuity or discreteness a prop- 
erty of a (certain type of) phenomenon (see Fuchs and Victorri 1994 for semantic 
phenomena) or merely a characterization of the model capturing the phenome- 
non? The choice for continuity or discreteness comes into play in all domains of 
linguistic analysis (as well as outside of linguistics) and at multiple levels. 
Whether something is discrete or continuous is subject to construal (Langacker 
2006: 114): a linguistic phenomenon is typically so complex that both discrete 
and continuous descriptions are appropriate, for different aspects of it. Thus, 
even if a phenomenon is gradual in nature, we could well gain insights from 
thinking about it in discrete terms, and vice versa. 

Langacker (2006: 114–126) discusses a variety of ways in which phenomena
can be viewed discretely or continuously. On the one hand, there are the discreti- 
zation techniques of, first, all-or-nothing responses to gradient input and, second, 
zooming in to yield a higher resolution and see more detail. Discreteness can be 
imposed through all-or-nothing responses to gradient input since the placement
of the boundary is arbitrary and implies discontinuity where there is none. An- 
other critical factor for discreteness is specificity, i.e., whether a phenomenon is 
viewed in coarse-grained or fine-grained detail. Something that appears contin- 
uous can be rendered discrete by “zooming in” to examine it at a higher resolu- 
tion, where differences between individual items become visible. 

On the other hand, there are continuity-imposing measures such as schemati- 
zation and summation. Schematization ensures that two experiences become 
equivalent at a certain level, so that comparing them registers identity rather than 
disparity and thus facilitates recognition: if we apprehended everything in full, 
fine-grained detail, we could not build up a coherent view of the world, since 
every experience would be unique. Summation too yields continuous properties. 
Grammaticality judgments, for example, are intrinsically continuous, with devi- 
ance being the cumulative result of multiple factors. It is only when the sum of 
these individual factors passes a certain threshold that a clear-cut judgment of 
ill-formedness emerges. But any particular cut-off point is arbitrary, since the 
judgments are gradient. At the same time, the continuity is derivative rather than 
primitive, since it represents the cumulative result of numerous individual as- 
sessments. 
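
The interplay of summation and thresholding described above can be made concrete with a small sketch. The factor names, penalty values and thresholds below are invented for illustration only and are not taken from Langacker (2006) or from the data discussed in this paper. The sketch merely shows that the summed score is continuous while the verdict depends entirely on where the cut-off is placed.

```python
# Illustrative sketch only: factor names, penalty values and thresholds
# are hypothetical and do not come from the source.

def deviance(penalties):
    """Sum the graded penalties of the individual factors into one score."""
    return sum(penalties.values())


def judge(penalties, threshold):
    """Impose an all-or-nothing verdict on the continuous deviance score."""
    return "ill-formed" if deviance(penalties) > threshold else "well-formed"


# One hypothetical sentence, scored on three invented factors (0-10 scale).
sentence = {"agreement": 2, "word_order": 3, "case_marking": 1}

print(deviance(sentence))   # the cumulative, gradient score
print(judge(sentence, 5))   # verdict with a stricter cut-off
print(judge(sentence, 7))   # verdict with a more lenient cut-off
```

The same sentence is classified differently under the two thresholds, which is the sense in which any particular cut-off point is arbitrary.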


2.2 The role of abstraction in linguistic representation 


The problem of continuity versus discreteness also poses itself on a representa- 
tional level. What kind of linguistic information is encoded? Structure or usage? 
Rules or facts? Or is the former derived from the latter? 

Since rules are not “given” in the input, if they “exist”, they must be inferred 
from input. If we see syntactic knowledge in terms of rules, we must postulate 
either a rich body of innate linguistic knowledge or a sophisticated grammar in- 
duction device. There are problems with both the generativist approach, postu- 
lating a Universal Grammar, as well as with the emergentist approach, searching 
for a powerful grammar induction device. 

Recently, proposals have been put forward that favour storage of facts, i.e., 
minimally different, partially overlapping exemplars. Researchers disagree as to 
what then happens to these exemplars. Do exemplars remain stored in clouds 
that (have a prototype structure? and) are efficiently searched when activated (cf. 
Bybee 2013) or do such rote-learned formulas form templates that gradually de- 
velop into distinct low-level schemas? In low-level schemas, none of the slots is 
tied to specific lexical items, as a result of storage-efficient data compression in 
long-term memory (Dąbrowska 2000). Unlike the abstract rules of formal linguistics,
usage-based schemas are derived from actual expressions and have the same
structure as their instantiations. According to Langacker (1991: 133 and else- 
where), the function of higher level schemas in the linguistic system is primarily 
an organizational one. 

Human beings purportedly excel at observing patterns in the speech stream 
(Saffran, Aslin and Newport 1996; Gómez and Gerken 1999) and abstract distributionally
defined categories from input. But does pattern detection (need to)
yield anything like a linguist’s grammar? Distributional analysis has also proven 
relevant in the context of computational modeling. Redington and Chater (1997, 
1998) show that distributional analysis yields relevant patterns at low and high 
levels of abstraction. Yet, they point out that the study of distributional infor- 
mation and semantics from a psychological perspective is in its infancy (Reding- 
ton and Chater 1998: 183). Although the cognitive system is sensitive to features 
of the input, determining empirically whether infants actually exploit particular 
sources of distributional information to build their grammatical knowledge from 
the ground up remains an open question. This raises the issue of cognitive reality 
for results of distributional linguistic analysis. 
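
To give a rough idea of what a distributional analysis involves, the following minimal sketch represents each word by counts of its immediate left and right neighbours and compares words by cosine similarity. The toy corpus, the one-word context window and the similarity measure are all assumptions made for illustration. The sketch does not reproduce the procedure of Redington and Chater (1997, 1998).

```python
# Illustrative sketch only: the toy corpus, the context window and the
# similarity measure are assumptions, not the method of any cited study.
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(tokens):
    """Count, for every word, its immediate left and right neighbours."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        if i > 0:
            vectors[word][("L", tokens[i - 1])] += 1
        if i < len(tokens) - 1:
            vectors[word][("R", tokens[i + 1])] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[feature] for feature, count in u.items())
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


corpus = "the cat sleeps . the dog sleeps . a cat runs . a dog runs .".split()
vectors = context_vectors(corpus)

# Words that occur in the same contexts come out as more similar.
print(cosine(vectors["cat"], vectors["dog"]))
print(cosine(vectors["cat"], vectors["sleeps"]))
```

In this toy corpus the two nouns share all their contexts and therefore group together, whereas a noun and a verb share none. Whether speakers build categories in a comparable way is exactly the question of cognitive reality raised above.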

The following survey-based study on Binding Scale dynamics in Polish is a 
case in point. It explores what level of granularity is ideal for describing the Bind- 
ing Scale. What kind of picture emerges at a lower level of abstraction, with more 
detail about variation? Data for this study stems from a large survey of verbs that 
combine with an infinitive in Polish. Before presenting details on the measuring 
instrument (Section 3.1) and the data collection (Section 3.2), I will briefly intro- 
duce the [Vfin Vinf] phenomenon and its relevance to the issues outlined in Sec- 
tions 1 and 2. 


3 The Vfin Vinf system: diagnostics and data 


Polish has more than 20,000 verbs but very few take an infinitive. Culling verbs 
that combine with an infinitive from the 100,000-word corpus-based dictionary 
Inny Słownik (Bańko 2000) yielded 95 such verbs (a list is provided in Appendix
1). Descriptions of the [Vfin Vinf] system are few and far between and this comes 
as no surprise. The [Vfin Vinf] construction is exceptional within any verbal sys- 
tem: usually, one verb is enough to form a full-fledged clause or sentence, as in 
the example I came across a problem. Such events are called simplex events. 
Sometimes, more than one verb will be used in one clause or sentence, as in I 
decided to solve the problem, with the finite verb decided and the infinitive [to] 
solve. Although less than 1% of all verbs combine with an infinitive, some of the
members of this category are highly frequent, such as modals or auxiliary verbs. 
Moreover, not all [Vfin Vinf]s are created equal: a distributional analysis shows 
that different finite verbs entertain links of different strength with their infinitives 
(Divjak 2007). In Sections 3.1.1 to 3.1.3, I will describe the set of three diagnostic 
tests that make it possible to differentiate between the different degrees of inte- 
gration between the two verbs in a [Vfin Vinf] construction. 


3.1 Diagnostic tests 


The three diagnostic tests, initially proposed in Divjak (2007) (to which I refer for 
details and references), reveal the degree to which the two verbs or events are 
structurally integrated. They measure the cognitive status of the infinitive clause 
and the degree of integration between finite verb and infinitive by referring to the 
functions verbs typically fulfil. Verbs express events that have participants and 
this is captured in their argument structure. This observation forms the basis for 
the thing-test in Section 3.1.1 and for the that-test in Section 3.1.2. Events also take 
place at a certain moment in time (and space), which forms the verbs’ temporal 
event structure. This is exploited in the time-test in Section 3.1.3. 


3.1.1 The thing-test 


The first diagnostic, the “thing”-test, reveals the conceptual status of the infini- 
tive seen from the point of view of the finite verb. Very briefly, in Cognitive Gram- 
mar, nouns and verbs instantiate diverging kinds of predication (Langacker 1987: 
Ch. 4, 5, 6): verbs represent relational predications whereas nouns represent non- 
relational predications. Furthermore, nouns and verbs differ in terms of the type 
of entities they designate and the sort of scanning required to capture the entities 
they depict. Nouns are symbolic structures whose semantic poles profile things, 
i.e., scenes that are conceived as being unrelated to time and are scanned sum- 
marily, as a whole. Verbs profile processes or series of component states distrib- 
uted through a continuous span of conceived time and are scanned sequentially, 
frame by frame. Infinitives are intermediary between nouns and verbs as they 
profile atemporal relations. Therefore, the conceptualization type typical of the 
(finite) verb can be determined by tracking whether the verb combines with both 
things and relations or only with one of them. 

The question thus becomes: does a specific finite verb need an infinitive or 
can it do with a noun? In (1) and (2), this question is explored with pro-structures, 
i.e., pro-nouns to refer to things and a pro-verb to refer to actions. If the pro-verb
do something can be subsumed under the pro-nominal something for a particular
(lemma of a given) verb, then the verb referred to by do something is in essence
conceptualized as a thing, despite its relational appearance as a verb.


(1) He planned to travel to Warsaw.
    what?
    to do what?


(2) He had to travel to Warsaw.
    *what?
    to do what?


The verb plan from (1) expresses a process, i.e., it is a relational entity, and com- 
bines with infinitives, i.e., entities that, just like processes, have their own rela- 
tional profile, albeit an atemporal relational profile. Yet, the question what (does 
he plan) to do? is not strictly necessary. One could also ask what (does he plan)? 
and receive as response to travel to Warsaw. At a more abstract, non-lexicalized 
level, the action expressed by the infinitive is thus reified, i.e., conceptualized as 
a thing. In other words, the thing-test shows that verbs like plan do not need an- 
other relational profile as offered by the infinitive: the infinitive can be the an- 
swer to a pro-nominal question. Thus, conceptually, plan treats the infinitive as 
any other non-relational entity it combines with. One could say that a verb like 
plan evokes conceptualization of the conceived scene expressed by the infinitive 
like any non-relational thing in that position, more precisely, like a direct object. 
The infinitival relation is thereby presented as a thing, i.e., as an entity that is 
scanned as a unitary whole and is made conceptually subordinate to the process 
expressed by plan. 

The situation is quite different with a finite verb like have in (2), which exem- 
plifies the second scenario. The infinitive that follows this verb cannot be cap- 
tured by the pro-noun what, belonging to the argument structure of the finite 
verb. The question what (did he have) to do? remains required to obtain to travel 
to Warsaw as answer. This indicates that, with certain verbs, the infinitival rela- 
tional profile cannot be backgrounded or made conceptually subordinate to that 
of the finite verb. The finite verb necessarily evokes the idea of another verbal 
relation, albeit an atemporal relation. 


3.1.2 The that-test


Apart from differences in the “cognitive status” of the infinitive, [Vfin Vinf] pat- 
terns also differ in how “close” the second verb needs to be to the finite verb. 
Closeness can be judged spatially (i.e., within sentence boundaries) as well as 
temporally and sheds light on the strength and independence of the (finite) verb 
and the event it expresses. 

Closeness within sentence boundaries can be determined by rephrasing the 
infinitive clause as a that-clause. Some verbs that combine with an infinitive are 
restricted to the [Vfin Vinf] pattern while other verbs can link to the second verb 
using a that-construction, without causing the finite verb to change its meaning. 
The verb promise can introduce that-complement clauses and can use these com- 
plement constructions to express the infinitival content alternatively: (3a) can be 
(partially) paraphrased using the pattern of (3b). Unlike promise, try does not oc- 
cur with a that-complement clause at all, as illustrated in (4a) and (4b). 


(3) a. She promised to tell him the truth.
    b. She promised that she would tell him the truth.


(4) a. She tried to tell him the truth.
    b. *She tried that she would tell him the truth.


Complementation has been described in terms of conceptual subordination and 
dependence (Langacker 1991: 440-442). Viewing the subordinate clause as a 
main clause participant implies conceptual distancing that encourages summary 
scanning of the component states if not their reification. In other words, constru- 
ing the second verb’s content as a full-fledged complement clause equals impos- 
ing a nominal construal on the second verb and the elements that depend on it 
and detaching that structure conceptually from the finite verb. Compare here 
Wierzbicka's (1988: 132-141) and Givón's (2001: Ch. 12) analyses of
that-complementation in English.

Verbs that do not allow that-complementation and are instead restricted to
combinations with infinitives share morphological and syntactic information with
the infinitive, and strict co-reference rules apply. Such verbs depend to a higher
degree on the infinitive than those finite verbs that combine with an infinitive as well as with a
full-fledged complement clause. Although the latter constructions also consist of 
two events, both events exist to a certain extent independently of one another 
and the infinitive event can be made subordinate to the finite verb event. 


3.1.3 The time-test


The (im)possibility of modifying both verbs in a [Vfin Vinf] structure with con- 
flicting time adverbials or adverbial expressions of time shows how the different 
verbs that combine with an infinitive deal with the co-temporality requirement. 
This provides a second measure for the degree of integration between the finite 
verb and the infinitive, a measure that is moreover independent of the verb's ar- 
gument structure and conceptual subordination of one event to the other. 

The verb ask can be used in a construction that locates the finite verb and
the infinitive in two different and not necessarily tightly sequential moments in
time. The verb manage, by contrast, demands temporal overlap or tight sequentiality.
This contrast is illustrated in (5) and (6).


(5) a. He asked her to buy a ticket.
    b. Yesterday he asked her to buy a ticket tomorrow.


(6) a. He managed to buy a ticket.
    b. *Yesterday he managed to buy a ticket tomorrow.


Temporal distancing does not imply conceptual subordination. Inserting con- 
flicting temporal specifications is a way to measure the degree of distance or in- 
tegration between the two verbs in [Vfin Vinf] structures, independent from their 
argument structure. The occurrence of temporal distance between two events 
merely entails their conceptual distance. The two events take place at two differ- 
ent moments in time. They are construed as distinct (though related) events 
(Wierzbicka 1975: 497-499; Lakoff and Johnson 1980: 131; Langacker 1991: 299
fn. 11).


3.2 A theoretically supported Binding Scale


The grammaticality of using each of the verbs that combine with an infinitive in
each of the three diagnostic tests can be used to build a Binding Scale, a scale of
looser to tighter integration between two events (see Divjak 2007 for details). A
binary approach (acceptable versus unacceptable) allows for eight logically possible
combinations or degrees of integration, as shown in Table 1. Pluses indicate
a positive score on a test, minuses a negative one.


Tab. 1: Binding scale


Degree   1      2      3      4      5      6      7      8
thing    +      +      +      +      -      -      -      -
that     +      -      +      -      +      -      +      -
time     +      +      -      -      +      +      -      -
         main                                             auxiliary
         verbs                                            verbs


The eight different logically possible combinations of properties correlate with 
eight different degrees of integration between the two verbs in the [Vfin Vinf] con- 
struction. The categories were ordered according to the thing-test, followed by 
the time-test and, finally, by the that-test. The that-test was considered the link- 
ing diagnostic because it overlaps partially with the thing-test in that it tests for 
the object status of the infinitive structure and partly with the time-test in that it 
tests for separability. 
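
The ordering just described can be made explicit with a small sketch. It is an illustration only, not part of the original analysis: the function name binding_scale is hypothetical, and the loop order simply mirrors the text (the thing-test varies slowest, then the time-test, and the that-test varies fastest).

```python
from itertools import product

def binding_scale():
    """Return the eight degrees of integration as (degree, thing, that, time).

    Hypothetical helper: the tests are nested in the order thing > time > that,
    so that the that-test alternates between adjacent degrees.
    """
    scale = []
    for degree, (thing, time, that) in enumerate(product("+-", repeat=3), start=1):
        scale.append((degree, thing, that, time))
    return scale

for degree, thing, that, time in binding_scale():
    print(f"{degree}: {thing} thing, {that} that, {time} time")
```

Read row by row, the output reproduces the columns of Table 1, from degree 1 (positive on all three tests) to degree 8 (negative on all three).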

[Vfin Vinf] combinations on the left-hand side of Table 1 score positively on 
all three diagnostic tests. They show the loosest type of bond and are considered 
multiple, independent events. [Vfin Vinf] combinations on the right-hand side of 
Table 1 score negatively on all three diagnostic tests. These exemplify the tightest 
type of bond and qualify as complex, integrated events. The finite verbs of the 
former combinations are considered standard main verbs while the finite verbs 
in the latter combinations are considered auxiliary verbs, in the most general 
sense of the word. Once the argument structure of each of the verbs is taken into
account, several semantically coherent subgroups emerge within each category, 
as I demonstrated for Russian (Divjak 2007), which boasts about 300 verbs that 
combine with an infinitive. 

In order to construct a Binding Scale for Polish, data needs to be collected on 
how each of the 95 Polish verbs that combines with an infinitive responds to each 
of the three diagnostic tests. This can be done by relying on one’s intuitions or on 
the intuitions of a number of native speakers. In Section 3.3, I will briefly discuss
the way in which the acceptability of each of the 95 verbs in each of the three 
diagnostic tests was assessed by relying on a large sample of native speakers. In 
Section 4, I move on to finding semantically coherent groups in the data using 
cluster analysis. 


3.3 Data


The vast majority of linguistic theories rest on a peculiar type of data: acceptabil- 
ity or grammaticality ratings. Ratings of usage events are proxies: if we accept 
that the system constrains the possibilities, the constructions that are licensed by 
the system should be judged more acceptable than the constructions that are not 
licensed. And more acceptable constructions should be used more frequently 
than constructions that are not licensed. Traditionally, these ratings were ob- 
tained through introspection by the analyst, an approach that is problematic in 
many (if not most) respects. Linguists have addressed (part of) the issue by elic- 
iting ratings from larger numbers of native speakers. 

Data on which to construct the Binding Scale for Polish were gathered in a 
large elicitation survey, following Cowart (1997), in which native speakers of 
Polish rated the acceptability of the 95 Polish verbs that combine with an infinitive
in each of the three diagnostic tests that together reveal the degree of
integration between the verbs in the [Vfin Vinf] structure (see Section 3.1).

Trigger sentences were constructed for each verb*test combination, i.e., all 
95 verbs were used in the three test-constructions, resulting in 285 test sentences. 
To avoid lexical effects, three different examples were constructed per verb*con- 
struction combination. All sentences were adaptations of authentic sentences ex- 
tracted from the Polish National Corpus (non-literary texts) that were comparable 
in complexity and length. 285 participants saw fifteen randomly selected 
verb*construction combinations in which fifteen different verbs were used and 
each of the three test-constructions was presented five times. 
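
The arithmetic behind this design can be checked with a short sketch. It assumes a perfectly balanced assignment of verb*construction combinations to participants, which the text does not state explicitly; the function and parameter names are hypothetical.

```python
def design_counts(n_verbs=95, n_tests=3, n_participants=285, items_per_participant=15):
    """Return (combinations, total ratings, ratings per combination).

    Assumption: every verb*construction combination is rated equally often.
    """
    combinations = n_verbs * n_tests                      # 95 * 3
    total_ratings = n_participants * items_per_participant  # 285 * 15
    ratings_per_combination = total_ratings / combinations
    return combinations, total_ratings, ratings_per_combination

print(design_counts())
```

Under that assumption, the 285 combinations receive 4275 ratings in total, i.e., fifteen per combination, which agrees with the fifteen ratings per verb*construction combination that are summed in Section 4.1.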

The trigger sentences were hidden among 30 filler sentences that were comparable
in complexity and length and likewise exhibited grammaticality levels
ranging from -2 to +2, as judged by native speakers. Both triggers and fillers were
randomly assigned to blocks (to avoid order effects) that each contained one ex- 
ample of each construction type (three triggers) and one example of each mistake 
level (five fillers). These eight sentences were randomized within blocks, i.e., they 
were pseudo-randomized to ensure no questionnaire started with a trigger and 
triggers never followed each other. For an example, see Appendix 2. 

Surveys of one page and a half were filled out in class by undergraduate stu- 
dents of English or German in Poland. Participants were asked to “tell me how 
Polish this sentence sounds” and their answers were recorded on a five-point Lik- 
ert scale (-2 to +2 and ?). On this scale, they were told, -2 stands for unnatural 
Polish, i.e., a sentence that sounds strange and may even be difficult to understand.
The middle value, 0, signaled “OK” Polish or sentences a native speaker
could produce, although they are not perfect (this accommodates the strong prescriptive
tradition concerning the regulation and teaching of Polish to which participants
would have been exposed). Finally, +2 was reserved for natural Polish
sentences that are fully normal and understandable. Participants were assured
there were no right or wrong answers.


4 Finding groups in the data 


Structure is an abstraction over usage data, yet very little is known about the 
amount of variation that is discarded in traditional linguistic analyses. In this 
section, I will use exploratory statistical techniques to detect natural groupings 
in the data and compare those to the eight degrees of integration that together 
make up the Binding Scale presented in Section 3.2. 

The acceptability ratings were subjected to cluster analysis, an unsupervised 
learning technique that detects structure in data (see Baayen 2008; Johnson 
2008; Gries 2009; Divjak and Fieller 2014; Levshina 2015). Cluster analysis is an 
exploratory data analysis technique, encompassing a number of different algo- 
rithms and methods for sorting different objects into groups. It requires the ana- 
lyst to make choices about dissimilarity measures and grouping algorithms. Yet, 
in contrast to many other statistical methods, there seem to be fewer diagnostics 
informing of the weaknesses of any classification solution proposed. Therefore, 
“look[ing] for cluster groupings that agree with existing or expected structures”
and “pick[ing] the one solution you like best” are not frivolous comments in the 
context of cluster analysis (Divjak and Fieller 2014: 430). Here, I will try a number 
of different dissimilarity measures and grouping algorithms to see whether any 
one combination can identify clusters that correspond to the eight degrees of in- 
tegration from the Binding Scale discussed in Section 3.2. 

The nature of the Likert scale used to collect grammaticality judgments poses 
a challenge in this respect. Whether the Likert scale is an ordinal or an interval 
scale is the subject of much debate. Although Likert himself assumed that the 
scale has interval qualities, as it was originally intended as a summated scale (af- 
ter the questionnaire is completed, item responses are summed to create a score 
for a group of items), some consider a Likert scale to be ordinal in nature. Hence, 
treating the data as interval, or even ratio, is doubtful: summing ordinal data will 
not make it interval data, it will only make it summated ordinal data. The problem 
is compounded if only five levels of (dis)agreement are used, since respondents 
will not perceive all pairs of adjacent levels as equidistant. It has been objected, 
however, that, if the wording of response levels implies symmetry of response 
levels around a middle category, measurements would fall between ordinal and 
interval level. To treat such data as ordinal could mean ignoring information it
may contain. Furthermore, accompanying the item-to-be-rated with a visual an- 
alog scale where equal spacing of response levels is clearly indicated has been 
said to increase the likelihood that respondents construe the points as equidis- 
tant. Although both requirements were met in the questionnaires used, I remain 
doubtful as to whether the data could be considered anything but ordinal. 

Since few clustering techniques deal with ordinal data, several work-arounds 
are explored, i.e., clustering summated responses (Section 4.1) and clustering 
summated proportions of responses (Section 4.2). Although the assumption that 
speakers have had less exposure to constructions they consider bad and are less 
likely to use such constructions themselves underlies both types of data summaries,
there is a qualitative difference between these two approaches. Similarities in
summated proportions of respondents assigning a particular score are slightly
more precise in that they keep variation in the data, while similarities between
summated responses may gloss over the very different combinations of judgments
they are made up of. For example, a summed score of 10 might result from
five respondents assigning the test construction a marginally unacceptable
score or from two respondents considering the construction perfect and three others
considering the construction unacceptable.
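
The difference between the two summaries can be illustrated with a minimal sketch. The two rating vectors below are invented for illustration and are not taken from the survey; the sketch also assumes that “?” responses have been set aside beforehand.

```python
from collections import Counter

SCALE = (-2, -1, 0, 1, 2)  # the five rating levels used in the survey

def summed_score(ratings):
    """Summated response: one number per verb*construction combination."""
    return sum(ratings)

def score_proportions(ratings):
    """Proportion of respondents choosing each level, in the order of SCALE."""
    counts = Counter(ratings)
    return [counts[level] / len(ratings) for level in SCALE]

# Hypothetical data: fifteen ratings each, same sum, very different spread.
uniform = [0] * 15
polarized = [-2] * 7 + [0] + [2] * 7

print(summed_score(uniform), score_proportions(uniform))
print(summed_score(polarized), score_proportions(polarized))
```

Both hypothetical items receive a summed score of 0, yet their proportion vectors differ sharply, which is the kind of variation that the proportion-based summary retains.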


4.1 Cluster analysis on summated responses 


For a first series of analyses, the fifteen ratings per verb*construction combina- 
tion were summed up. Responses to several Likert questions can be summed, pro- 
vided that all questions use the same Likert scale and that the scale is a defenda- 
ble approximation to an interval scale, in which case they may be treated as 
interval data measuring a latent variable. 

The data was then taken through hierarchical agglomerative cluster analysis, 
using agnes() from the package cluster in R, with Euclidean as the distance meas- 
ure and Ward’s as the amalgamation algorithm. Euclidean measures the distance 
between items “as the crow flies” and Ward’s is known to yield small groups. The 
combination of both has proven to work well for linguistic data. The results are 
presented in the dendrogram in Figure 1. The dendrogram is read bottom up, with 
lower clusters representing items that are very similar and hence end up being 
clustered first. These lower-level clusters are then in turn grouped to form higher- 
level clusters and this process is repeated until all clusters are united in one over- 
arching cluster. 
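
The bottom-up procedure can be sketched as follows. This is a naive illustration of Ward's criterion, written for readability; it is not the agnes() implementation used for the analysis, its merge costs are not on the same scale as the heights agnes() reports, and the toy verbs and scores are invented.

```python
from itertools import combinations

def centroid(points):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))  # squared Euclidean
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clustering(items):
    """items: dict label -> vector. Returns the merges as (labels_a, labels_b, cost)."""
    clusters = {(label,): [vec] for label, vec in items.items()}
    merges = []
    while len(clusters) > 1:
        # Merge the pair of clusters whose fusion is cheapest.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: ward_cost(clusters[pair[0]], clusters[pair[1]]))
        merges.append((a, b, ward_cost(clusters[a], clusters[b])))
        clusters[a + b] = clusters.pop(a) + clusters.pop(b)
    return merges

# Hypothetical summed scores on the thing-, that- and time-test for four toy verbs.
toy_scores = {"v1": [28, 25, 27], "v2": [26, 27, 25],
              "v3": [-20, -24, -22], "v4": [-25, -21, -26]}
merges = ward_clustering(toy_scores)
for a, b, cost in merges:
    print(a, "+", b, "cost", round(cost, 2))
```

The most similar items are joined first, and the last merge unites the two remaining groups, which corresponds to reading a dendrogram from the bottom up.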

Fig. 1: Dendrogram of HAC cluster analysis on summated data with Euclidean as distance
measure and Ward's as amalgamation algorithm

The agglomerative coefficient (AC), indicated at the bottom of the plot, is a measure
of the clustering structure of the dataset that ranges from 0 to 1. An AC close
to 1 indicates that a very clear structuring has been found whereas an AC close to
0 indicates that the algorithm has not found a natural structure. Do bear in mind
that this measure is sensitive to sample size, i.e., the value goes up as the number
of observations grows. In the present analysis, the AC for the dendrogram is very
high (0.96) and this supports the presence of natural varieties (despite the indicator's
sensitivity to the sample size).

Given the large number of clusters distinguished, a non-hierarchical cluster 
analysis was carried out to find the optimal clustering. This was done with pam() 
from the package cluster in R, using the same Euclidean distance measure. Sil- 
houette plots were used to compare clustering solutions. These plots are read 
from left to right, and each silhouette represents one cluster. 

Fig. 2: Average silhouette width for seven-cluster solution

The more the silhouette shape resembles a rectangle, the higher the similarity of
the elements in the cluster. The similarity is also expressed quantitatively by 
means of a silhouette value, which measures the degree of confidence in the clus- 
tering assignment of an observation. Well-clustered observations that are very 
distant from neighbouring clusters have values near 1, while poorly clustered ob- 
servations that are probably assigned to the wrong cluster have values near -1. 
The average silhouette width is the average of the silhouette widths for all objects 
in the whole dataset and indicates the goodness of the overall clustering. Com- 
paring average widths across clusterings reveals the best cluster solution. The 
optimal clustering solution for the data appeared to contain seven clusters, which 
is shown in the silhouette plot in Figure 2. Yet, each of the clusters has a relatively 
low silhouette width (ranging from 0.22 to 0.39) and the Average Silhouette Width 
for the optimal seven-cluster solution remains as low as 0.31, indicating that the 
proposed clustering may not be sensible. 
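
For readers who want to see how such values arise, the following sketch computes silhouette widths by hand using the standard definition: for each observation, a is the mean distance to the other members of its own cluster, b is the lowest mean distance to any other cluster, and the width is (b - a) / max(a, b). The points and labels are invented, and the code is not the implementation in the R package cluster.

```python
import math

def silhouette_widths(points, labels):
    """Silhouette width per observation, using Euclidean distance."""
    widths = []
    for i, p in enumerate(points):
        dists = {}  # cluster label -> distances from p to that cluster's members
        for j, q in enumerate(points):
            if j != i:
                dists.setdefault(labels[j], []).append(math.dist(p, q))
        own = dists.pop(labels[i], None)
        if not own or not dists:
            widths.append(0.0)  # singleton cluster (or a single cluster): width 0
            continue
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for d in dists.values())
        widths.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return widths

def average_silhouette_width(points, labels):
    widths = silhouette_widths(points, labels)
    return sum(widths) / len(widths)

# Hypothetical data: two tight, well-separated pairs.
points = [[0, 0], [0, 1], [10, 10], [10, 11]]
good_labels = [0, 0, 1, 1]  # respects the two pairs
bad_labels = [0, 1, 0, 1]   # splits each pair

print(average_silhouette_width(points, good_labels))
print(average_silhouette_width(points, bad_labels))
```

The labelling that respects the two pairs yields an average width close to 1, whereas the labelling that splits them yields a negative average, in line with the interpretation given above.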

This conclusion is confirmed by looking at the contents of each cluster. For 
each of the seven clusters a medoid is identified. A medoid is the most centrally 
located point in the given data set, representative of a data set in the sense that 
its average dissimilarity to all the objects in the cluster is minimal. The medoids 
are listed in Table 2. As mentioned in Section 3.2, the verbs in a cluster are ex- 
pected to resemble each other semantically. The medoids do not show a strong 
semantic resemblance to the other verbs that are part of the same cluster, unfortunately.
Table 3 contains details on one of the clusters listed in Table 2, i.e., the
one for which the medoid is bać się ‘be afraid of, fear’ (the complete contents of
each of the seven clusters are listed in Appendix 3). Apart from one verb, (za)wahać
się ‘hesitate, waver’, all other verbs express rather the opposite of fear. There is
some semantic cohesion between other verbs that are part of this cluster, however.
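
The notion of a medoid can be made concrete with a small sketch. The cluster below is invented and uses placeholder labels rather than verbs from the survey; Euclidean distance is assumed, and the function name medoid is hypothetical.

```python
import math

def medoid(cluster):
    """cluster: dict label -> vector.

    Returns the label of the member whose average Euclidean distance to the
    other members is smallest.
    """
    if len(cluster) == 1:
        return next(iter(cluster))

    def average_dissimilarity(label):
        others = [vec for key, vec in cluster.items() if key != label]
        return sum(math.dist(cluster[label], vec) for vec in others) / len(others)

    return min(cluster, key=average_dissimilarity)

# Hypothetical cluster of four items with two-dimensional scores.
toy_cluster = {"verb_a": [0, 0], "verb_b": [1, 1], "verb_c": [2, 0], "verb_d": [1, 5]}
print(medoid(toy_cluster))
```

Unlike a centroid, the medoid is always an actual member of the cluster, which is why it can be reported as a representative verb.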


Tab. 2: Medoids for a non-hierarchical cluster analysis requesting 7 clusters


Cluster  Medoid                            Translation
1        bać się_                          ‘be afraid of, fear’
2        śpieszyć_pośpieszyć               ‘hurry, be in a hurry’
3        zobowiązywać się_zobowiązać się   ‘bind, pledge oneself’
4        uwielbiać_uwielbić                ‘adore, worship’
5        kończyć_skończyć                  ‘end, finish, conclude, close’
6        uczyć_nauczyć                     ‘teach, instruct’
7        potrafić_potrafić                 ‘know how to, manage’


Tab. 3: Contents of one cluster resulting from non-hierarchical cluster analysis requesting
seven clusters


Verb                              Translation
decydować się_zdecydować się      ‘determine, decide’
pozwalać_pozwolić                 ‘allow, permit, let’
zgadzać się_zgodzić się           ‘agree, concur, consent’
proponować_zaproponować           ‘offer, propose’
bać się_                          ‘be afraid of, fear’
godzić się_                       ‘agree, consent’
zalecać_zalecić                   ‘recommend, commend’
przykazywać_przykazać             ‘order, command’
bronić_                           ‘defend, guard, vindicate, assert’
namawiać_namówić                  ‘induce, persuade’
zamierzać_zamierzyć               ‘intend, mean, be going to’
zezwalać_zezwolić                 ‘allow, permit, let’
dopomagać_dopomóc                 ‘help, aid, assist’
wahać się_zawahać się             ‘hesitate, waver’
zakazywać_zakazać                 ‘forbid, prohibit’


The shape of the clusters in Figure 2 and the low average silhouette width confirm 
that there is no clear structure. Instead, many verbs are close to verbs from other 
clusters. The fact that the structure found may be artificial would explain why the 
overarching semantics of individual clusters is difficult to capture. 


4.2 Clustering summated proportions of responses 


Instead of summing all judgments provided for one sentence, we could also sum- 
marize the data by proportions of respondents who assign a particular score. 
Summarizing by proportions of responses was done in two different ways, using 
the original five-point scale and a condensed three-point scale.¹


1 Due to the instructions accompanying the rating scale, i.e., the fact that the middle point was
conceived as 0 to capture the judgment “could be heard”, creating a binary solution would require
second-guessing respondents' intentions for assigning a 0 as it could mean “could be
heard but I consider it unacceptable” or “could be heard and I consider it acceptable”.


4.2.1 Using a five-point rating scale


In a first analysis, proportions of responses were calculated using the original
five-point rating scale. Eight analyses were run, combining Euclidean and Manhattan
distance with complete, single and average linkage and
Ward's amalgamation algorithms. Because both distance measures yielded virtu- 
ally identical results, I will only present one set here. 
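
The two distance measures and the resulting eight analyses can be sketched as follows. The proportion vectors are invented, and the names DISTANCES, LINKAGES and ANALYSES are hypothetical labels introduced for illustration only.

```python
import math
from itertools import product

def euclidean(p, q):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    """Sum of absolute coordinate differences (city-block distance)."""
    return sum(abs(a - b) for a, b in zip(p, q))

DISTANCES = {"Euclidean": euclidean, "Manhattan": manhattan}
LINKAGES = ("complete", "single", "average", "Ward")
ANALYSES = list(product(DISTANCES, LINKAGES))  # 2 distances x 4 algorithms

# Hypothetical proportions of respondents choosing -2, -1, 0, +1, +2.
item_a = [0.2, 0.2, 0.2, 0.2, 0.2]
item_b = [0.0, 0.0, 0.2, 0.4, 0.4]

print(len(ANALYSES))
print(euclidean(item_a, item_b), manhattan(item_a, item_b))
```

For these two hypothetical items the Manhattan distance (0.8) is twice the Euclidean distance (0.4); the two measures weight the same differences differently but are computed from the same proportion vectors.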

The highest agglomerative coefficient was achieved by the Manhattan/Ward 
combination (0.87), followed by Manhattan/Complete (0.72), Manhattan/Aver- 
age (0.54) and Manhattan/Single (0.29). To assess the replicability of the cluster- 
ing, in the absence of an independent test-sample, p-values for all clusters con- 
tained in the clustering of the original data were calculated using the R package 
pvclust. For each cluster in hierarchical clustering, p-values are calculated via 
multiscale bootstrap resampling, a computer-based way of simulating similar da- 
tasets. Pvclust provides two types of p-values: the AU (Approximately Unbiased) 
p-value (on the left, normally in red) and BP (Bootstrap Probability) value (on the 
right, normally in green). The AU p-value, which is computed by multiscale bootstrap
resampling, is a better approximation to an unbiased p-value than the BP value
computed by normal bootstrap resampling. Clusters that are highly supported by 
the data will have large p-values. 

The two clusterings with the clearest structure as per the Agglomerative Co- 
efficient do not yield any high-level replicable clusters. Based on 100 replica- 
tions, the Manhattan/Ward combination yields nine clusters, each containing be- 
tween two and six verbs, with AU values above 95. The likelihood that these 
clusters would not be found in another dataset is thus rejected at significance 
level 0.05. These clusters appear in (red) rectangles in Figure 3. All clusters are 
lower-level groupings; no higher-level clusters are likely to be found in other datasets,
as the zeroes indicate. Of the lower-level groupings, only the six-verb cluster
(second from the right) is semantically coherent, containing verbs like ‘promise’
or ‘advise’. Manhattan/Complete yields a similar picture: eight replicable
clusters with between two and four verbs each.

Fig. 3: Dendrogram of HAC with Manhattan/Ward and p-values on five-point scale

In other words, working with five levels of acceptability results in many low-level
clusters. It is unclear from the data, however, what would motivate these
clusters. If linguists would like to prefer low-level generalizations over high-level 
ones, some form of similarity between the verbs in one cluster would be expected. 
Dąbrowska (2008), for example, found that speakers prefer low-level generalizations
over clusters of phonologically similar forms or clusters of words sharing
the same derivational affix to more global generalizations. The clusters do not,
however, contain verbs resembling each other from a semantic point of view and
there is no phonological or morphological similarity either. It is rare to find a cluster
containing infinitives ending in the same suffix, having a reflexive pronoun
or exhibiting the same morphological aspectual alternation pattern. 


4.2.2 Using a three-point rating scale 


Clusters containing only two to four verbs contribute little to our understanding 
of the category of [Vfin Vinf] verbs as a whole. Therefore, in a next step, the five 
scoring options were reduced to three, by collapsing the scores -2 and -1 as well 
as 1 and 2. The same eight analyses as described in Section 4.2.1 were run, four 
with the Euclidean distance measure and four with Manhattan. For both sets, the
agglomerative coefficients rank the amalgamation strategies in the same order:
Ward's does best, while Single linkage performs most poorly.
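
The collapsing step can be sketched as follows. The ratings are invented, “?” responses are assumed to have been excluded beforehand, and the function names are hypothetical.

```python
from collections import Counter

def collapse(rating):
    """Map a five-point rating (-2..+2) onto a three-point scale (-1, 0, +1)."""
    if rating < 0:
        return -1  # -2 and -1 are merged
    if rating > 0:
        return 1   # +1 and +2 are merged
    return 0       # the middle value is kept as is

def three_point_proportions(ratings):
    """Proportion of negative, middle and positive ratings, in that order."""
    counts = Counter(collapse(r) for r in ratings)
    return [counts[level] / len(ratings) for level in (-1, 0, 1)]

# Hypothetical ratings for one verb*construction combination.
ratings = [-2, -1, 0, 1, 2, 2, 1, 0, -1, 2]
print(three_point_proportions(ratings))
```

Each combination is thus described by three proportions instead of five, so that distinctions of degree within the negative and the positive range no longer contribute to the distances between verbs.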

With both distance measures, Ward-based clusterings achieve an agglomerative
coefficient over 0.90 (both Euclidean/Ward and Manhattan/Ward get 0.93) while
Complete-based clusterings receive an agglomerative coefficient over 0.80
(Manhattan/Complete gets 0.83 and Euclidean/Complete gets 0.82). Manhattan/Average
gets 0.69 and Euclidean/Average 0.68 while Euclidean/Single gets 0.41 and
Manhattan/Single 0.39.

These analyses were followed up with pvclust, to determine which clusters 
could be expected to replicate. Using pvclust with 1000 repetitions to assess the 
uncertainty in the Euclidean/Ward hierarchical cluster analysis, the two over- 
arching groups that are amalgamated last both receive AU (approximately unbi- 
ased) p-values of 99. In other words, the hypothesis that these clusters do not 
exist is rejected at significance level 0.01. 

Fig. 4: Dendrogram of HAC with Manhattan/Ward and p-values on three-point scale

The highlighted clusters in Figure 4, one on the left-hand side containing 22 verbs
and the other one containing all remaining verbs, do not seem to exist only because
of sampling error but may be stably observed if we increase the number of
observations. The second-best clustering (running on Euclidean/Complete, not
pictured here) suggests different clusters would replicate. The same high-level 
cluster of 22 verbs emerges but it is complemented by a medium-level seven-verb 
cluster expressing attitudes such as ‘like’ or ‘detest’, as well as by fifteen low- 
level clusters containing between two and four verbs each. These smaller clusters 
remain semantically unmotivated. 

The two clusters in Figure 4 that are amalgamated last are of most interest 
from the point of view of the Binding Scale introduced in Section 3.2. It is also 
important that the leftmost cluster falls out of the second-best clustering as well. 
The two high-level clusters correspond to what I earlier called main verbs and 
auxiliary verbs respectively. The leftmost cluster contains the so-called auxiliary 
verbs whereas the rightmost cluster contains all the other verbs. In other words, 
auxiliary verbs behave differently enough from all other verbs to be rated in such 
a way by naive speakers that they are picked up by a clustering program. The 
verbs listed in Table 4 qualify as auxiliary verbs. This diverse group of so-called 
auxiliary verbs is consistent with the results for English (Givón 2001: 54-58) and 
Russian (Divjak 2007), where semantic clusters of verbs expressing modality, in- 
tention, attempt, result and phase are attested within the category of auxiliary 
verbs. Comparable findings have been reported for non-Indo-European language 
systems, which may use verbal affixes, modifiers to a verb (including both ad- 
verbs and modal verbs) and non-inflecting particles within a clause to express 
similar concepts (Dixon 1996: 178). 


Tab. 4: Replicating cluster of verbs with Manhattan/Ward on three-point scale


Verb                           Translation                         Classification
_zdołać                        ‘be able’                           result
_zechcieć                      ‘become willing’                    volition
dawać się_dać się              ‘be possible, allow itself’         modality
dokańczać_dokończyć            ‘finish up, conclude’               phase
kończyć_skończyć               ‘end, finish, conclude, close’      phase
kontynuować_                   ‘continue’                          phase
kusić się_skusić się           ‘seek to obtain, attempt’           attempt
mieć_                          ‘have to’                           modality
móc_                           ‘can, be able’                      modality
musieć_                        ‘be obliged to, have to’            modality
poczynać_począć                ‘begin, originate’                  phase
przestawać_przestać            ‘cease, stop, discontinue’          phase
raczyć_raczyć                  ‘deign, condescend’                 result
rozpoczynać_rozpocząć          ‘begin, start, commence’            phase
silić się_                     ‘make efforts, exert oneself’       attempt
śmieć_                         ‘dare, venture’                     NA
usiłować_                      ‘make efforts, endeavor, attempt’   attempt
wzbraniać się_wzbronić się     ‘forbid’                            NA
zaczynać_zacząć                ‘begin, start, commence’            phase
zamyślać_zamyślić              ‘design’                            volition
zdążać_zdążyć                  ‘manage to do on time’              result
żenować się_                   ‘feel embarrassed’                  NA


5 Is there a system in the variation? 


It has been claimed that language is a social fact, an observable regularity in lan- 
guage use realized by a specific community. But it is also a cognitive fact because 
the members of the community have an internal representation of the existing 
regularities that allows them to realize the same system in their own use of the 
language (Geeraerts 2010: 237-238). In the case of the [Vfin Vinf] constructions 
discussed in this paper, would the proposed Binding Scale fall out of a social in- 
terpretation of acceptability ratings for the diagnostics that motivate the system? 
And how much of any Binding Scale would speakers need to have internalized to 
yield judgments that would seem to support the abstract system? 

The one clear result that emerged from a series of cluster analyses supports a 
bifurcation of [Vfin Vinf] constructions into those built on a finite verb that is a 
main verb and those built on a finite verb that is an auxiliary verb. Small low- 
level classes exist but it is unlikely that there would be any widely shared local 
prototypes given that those lower-level classes did not exhibit any phonological, 
morphological or semantic coherence, which would be required to elevate the 
verb*construction combination from lexical idiosyncrasy to lower-level schema. 
Individual local prototypes may, however, have guided the ratings for individual
respondents and any divergence between these local prototypes may have further
increased the variability in the data. The cline of eight different degrees of
integration between the events expressed by means of a [Vfin Vinf] construction 
could not be reconstructed from acceptability ratings, when submitted to a 
(standard) statistical technique designed to find groups in data. 

The observed two-way classification fell out from data summarized as the 
proportion of respondents who assigned a score on a three-point scale, i.e., it is a
social construct and the result of summation and schematization. Summing the 
number of individuals who assigned a particular rating registered tendencies 
within the group of respondents. The scales had to tip for a (more) clear-cut judg- 
ment of ill-formedness to emerge. This process was facilitated by schematization: 
reducing the five-point scale to a three-point scale ensured that two experiences 
had a better chance of becoming equivalent, so that comparing them registered 
identity rather than disparity, thereby facilitating categorization. 
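
The two steps just described, schematization and summation, can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used in the study: the mapping from the five-point to the three-point scale (1-2, 3, 4-5) is an assumption, and the ratings are invented.

```python
from collections import Counter

# Hypothetical mapping from the five-point to the three-point scale
# (assumption: the two lowest and the two highest scores are merged).
FIVE_TO_THREE = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}


def schematize(ratings):
    """Collapse five-point ratings into three-point ratings."""
    return [FIVE_TO_THREE[rating] for rating in ratings]


def summate(ratings):
    """Return the proportion of respondents who assigned each score."""
    counts = Counter(ratings)
    total = len(ratings)
    return {score: counts.get(score, 0) / total for score in (1, 2, 3)}


# Invented ratings from eight respondents for one verb in one diagnostic.
raw_ratings = [1, 2, 2, 1, 3, 2, 5, 1]
profile = summate(schematize(raw_ratings))
print(profile)
```

On the five-point scale the eight invented respondents disagree (three chose 1, three chose 2); after schematization six of the eight fall into the same category, which is the sense in which two experiences have "a better chance of becoming equivalent". Profiles of this kind, one per verb and diagnostic, would be the input to a cluster analysis.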

The Binding Scale, like any other linguistic classification, abstracts away 
from variation to reveal the skeleton of a system that, if built on well-motivated 
diagnostic principles, should apply to a number of languages. For this study, us- 
age data was used to populate the cells. A sufficient number of speakers of Polish 
recognized the syntactic limitations on auxiliary verbs for them to emerge as a 
category at the social level. The sample of speakers that I polled appears to have 
a strong aversion towards using auxiliary verbs in any other constructions than 
[Vfin Vinf]. At the same time, speakers diverged in their assessment of the extent 
to which the three diagnostic constructions are felicitous for main verbs. Because 
of the variation in their judgments, no crisply delineated categories of main verbs
arise at the participants' group (i.e., social) level. This may mean that the finer 
details of the classification are not mentally real for any speakers, or maybe only 
for a small subgroup. 

In this case, the Binding Scale could be partly reconstructed on the basis of 
acceptability data on the diagnostics but only if that data is summarized so as to 
reveal its social basis. The cluster analyses suggest that the Binding Scale cap- 
tured conventionalization in society, not entrenchment in the mind. Language is 
very likely a complex adaptive system (Beckner et al. 2009) in which knowledge 
of the system's individual parts does not imply understanding of the system. The 
local agents or speakers know their task but the teleology of the system remains 
out of their grasp - if there is a goal to the overarching system at all. Knowledge 
is socially distributed: while each speaker individually knows part(s) of the
system, no one speaker knows them all. By putting this distributed knowledge
together, a picture of a socially supported system emerges that in its entirety is
unlikely to be mentally real for any one agent.

These findings limit what usage-based linguists, working within a cognitive
framework, can expect from theoretical models that are not built on usage data 
from a large number of speakers but on binary acceptability judgments from an 
individual. Even if a proposed account is theoretically justified and each diagnos- 
tic has a plausible cognitive explanation, the overarching model may well lack 
psychological reality for other speakers of the language. 


Acknowledgement: I am grateful to Neil Bermel, Petar Milin, James Street and two


Appendix 1: List of verbs that combine with an infinitive


No.  Verb (imperfective_perfective aspect)     Translation
1    zezwalać_zezwolić                         ‘allow, permit, let’
2    brzydzić się_                             ‘abhor, loathe, have an aversion’
3    przyrzekać_przyrzec                       ‘promise’
4    kochać_                                   ‘love’
5    wzbraniać się_wzbronić się                ‘forbid’
6    ośmielić się_ośmielić się                 ‘venture, dare’
7    zamyślać_zamyślić                         ‘design’
8    obawiać się_                              ‘fear, be afraid, be anxious’
9    umieć_                                    ‘know how, be able’
10   starać się_postarać się                   ‘endeavor, make efforts, take pain, try’
11   decydować się_zdecydować się              ‘determine, decide’
12   dawać się_dać się                         ‘let, allow’
13   pozwalać_pozwolić                         ‘allow, permit, let’
14   przyzwyczajać się_przyzwyczaić się        ‘become accustomed, get used’
15   poczynać_począć                           ‘begin, originate’
16   zabraniać_zabronić                        ‘forbid, prohibit, interdict’
17   życzyć [sobie]_zażyczyć [sobie]           ‘wish, desire’
18   kazać_kazać                               ‘bid, order, let’
19   proponować_zaproponować                   ‘offer, propose’
20   zakazywać_zakazać                         ‘forbid, prohibit’
21   móc_                                      ‘can, be able’
22   poważać się_poważyć się                   ‘dare’
23   nawykać_nawyknąć                          ‘become accustomed’
24   pomagać_pomóc                             ‘help, aid, assist’
25   przysięgać_przysiąc                       ‘swear’
26   próbować_spróbować                        ‘try, test, attempt’
27   radzić_poradzić                           ‘advise’
28   dokańczać_dokończyć                       ‘finish up, conclude’
29   ślubować_ślubować                         ‘vow, make a vow’
30   uczyć się_nauczyć się                     ‘learn’
31   śpieszyć_pośpieszyć                       ‘hurry, be in a hurry’
32   ubóstwiać_                                ‘idolize, adore’
33   woleć_                                    ‘prefer’
34   kończyć_skończyć                          ‘end, finish, conclude, close’
35   _zechcieć                                 ‘become willing’
36   godzić się_                               ‘agree, consent’
37   nienawidzić_                              ‘hate, detest’
38   pamiętać_                                 ‘remember, keep in mind’
39   obiecywać [sobie]_obiecać [sobie]         ‘promise’


36 —— Dagmar Divjak 


No.  Verb (imperfective_perfective aspect)     Translation
40   _omieszkać                                ‘fail’
41   planować_zaplanować                       ‘plan’
42   mieć_                                     ‘have to’
43   zobowiązywać się_zobowiązać się           ‘bind, pledge oneself’
44   _uwziąć się                               ‘set one’s mind, become crazy’
45   śmieć_                                    ‘dare, venture’
46   dopomagać_dopomóc                         ‘help, aid, assist’
47   rozpoczynać_rozpocząć                     ‘begin, start, commence’
48   wstydzić się_                             ‘be ashamed’
49   zgadzać się_zgodzić się                   ‘agree’
50   kusić się_skusić się                      ‘seek to obtain, attempt’
51   zalecać_zalecić                           ‘recommend, commend’
52   zapominać_zapomnieć                       ‘forget’
53   krępować się_                             ‘be embarrassed, feel uneasy’
54   potrzebować_                              ‘need, want, be in need of’
55   bronić_                                   ‘defend, guard, vindicate, assert’
56   raczyć_raczyć                             ‘deign, condescend’
57   silić się_                                ‘make efforts, exert oneself’
58   nakazać_nakazać                           ‘order, command’
59   zaczynać_zacząć                           ‘begin, start, commence’
60   bać się_                                  ‘be afraid of, fear’
61   postanawiać_postanowić                    ‘resolve, determine, make up one’s mind’
62   potrafić_potrafić                         ‘know how to do, manage’
63   uwielbiać_uwielbić                        ‘adore, worship’
64   musieć_                                   ‘be obliged to, have to’
65   odważać się_odważyć się                   ‘dare, venture’
66   usiłować_                                 ‘make efforts, endeavor, attempt’
67   ważyć się_odważyć się                     ‘dare, venture’
68   doradzać_doradzić                         ‘advise’
69   pragnąć_                                  ‘desire’
70   zdążać_zdążyć                             ‘manage to do (on time)’
71   prosić_poprosić                           ‘ask, beg, request’
72   chcieć_                                   ‘want, be willing, intend, desire, wish’
73   przyobiecywać_przyobiecać                 ‘promise’
74   polecać_polecić                           ‘recommend’
75   _zdołać                                   ‘be able’
76   myśleć_                                   ‘think, mean’
77   zamierzać_zamierzyć                       ‘intend, mean, be going to’
78   wahać się_zawahać się                     ‘hesitate, waver’
79   umożliwiać_umożliwić                      ‘enable, make possible’
80   lękać się_                                ‘fear, be anxious’
81   kwapić się_pokwapić się                   ‘be eager’
82   ofiarowywać się_ofiarować się             ‘offer (oneself)’
83   spodziewać się_                           ‘hope, expect’
84   uczyć_nauczyć                             ‘teach, instruct’
85   podejmować się_podjąć się                 ‘undertake’
86   kontynuować_                              ‘continue’
87   lubić_                                    ‘like, love’
88   przestawać_przestać                       ‘cease, stop, discontinue’
89   szykować się_przyszykować się             ‘prepare (oneself)’
90   przykazywać_przykazać                     ‘order, command’
91   _zaofiarować się                          ‘offer (oneself)’
92   namawiać_namówić                          ‘induce, persuade’
93   rozkazywać_rozkazać                       ‘order, command’
94   przywykać_przywyknąć                      ‘get accustomed to’
95   żenować się_                              ‘feel embarrassed’


Appendix 2: Example questionnaire 


Trigger sentences for each of the verbs in each of the three constructions were 
composed. To ensure naturalness as much as possible, the sentences were 
adapted from authentic sentences from the non-literary text sections from the 
PNC. The raw material for the sentences was extracted from the test version of the 
PNC (66 million words). Raw sentences were taken from written periodicals. If no 
examples were found, both dictionaries and (near-)native speakers were con- 
sulted. The sentences were then altered to contain the test constructions. To en- 
sure comparability, every trigger item consisted of two sentences that formed a 
whole and could stand alone, i.e., were not context dependent. All sentences are 
declarative statements. Positive sentences were used unless there was a clear 
counter-indication that the verb favored negative contexts. Sentence subjects are
male/female third person singular/plural. Finite verbs are past and perfective (if
possible). Infinitives are proportional to ‘do something’.


The following is an example of one block. The capital letters A, B and C refer
to the diagnostic tests (the thing-, that- and time-tests respectively). Small letters
a, b and c refer to the lexical set, while numbers identify the verb. The capital
letter F indicates filler sentences.


Ac42 Mieszkańcy Kołobrzegu mieli jeść, spać i oglądać telewizję w blokach 
poza centrum. Mieli to, aż nie naprawili przewodu gazowego w centrum. 

‘The inhabitants of K had to eat, sleep and watch TV in apartment buildings 
outside the center. They had this, until they fixed the gas pipes in the center.’ 
[example of an infelicitous thing-test]

F8 FBI prowadziło operację specjalną. Prowadzono operację w tak głębokiej
tajemnicy, że w pewnym momencie nawet sam prezydent nie był do końca
poinformowany.

F11 Demokracja to dla wielu ludzi rzecz oczywista o której nie myślą. Nie 
wiedzą co to jest żyć w dyktaturze. 

Ba23 Jest w złym humorze, bo nawykł urlop spędzać w Kalifornii. Jak człowiek 
już nawykł, żeby spędzać urlop w słonecznym miejscu, to polskich deszczowych 
lat nie uwielbia. 

‘He is in a bad mood, because he is used to spending his holidays in Califor- 
nia. Once you are used to spending your holidays in a sunny place, you no 
longer love Polish rainy years.’ [example of a felicitous that-test] 

F1 Berlin był miastem podzielonym murem. W 1989 roku ludzie z obu stron
zaczęli rozwalać mur.

F17 Sztucer to broń myśliwska na grubego zwierza. Zawsze brał właśnie sztucer
kiedy chodził na polowania.

Cc92 Po wyborach był całkiem rozczarowany. On był jednym z tych, którzy pod- 
czas kampanii wyborczej namówili członków zespołu w dzień wyborów 
wesprzeć Kerry'ego do Białego Domu. 

‘The elections had left him completely disappointed. He was one of those 
who during the election campaign had talked members of the team into sup- 
porting Kerry into the White House on election day.’ [example of a felicitous 
time-test] 

F25 Zgodnie z prawem księcia chronił królewski immunitet. Tylko królowa
mogła go zdecydować o ukaraniu go jak normalnego obywatel


Appendix 3: Contents of each of the seven clusters supported by K-means analysis on
summated responses


Verb in cluster 1                      Translation
decydować się_zdecydować się           ‘determine, decide’
pozwalać_pozwolić                      ‘allow, permit, let’
zgadzać się_zgodzić się²
proponować_zaproponować                ‘offer, propose’
bać się_                               ‘be afraid of, fear’
godzić się_                            ‘agree, consent’
zalecać_zalecić                        ‘recommend, commend’
przykazywać_przykazać                  ‘order, command’
bronić_                                ‘defend, guard, vindicate, assert’
namawiać_namówić                       ‘induce, persuade’
zamierzać_zamierzyć                    ‘intend, mean, be going to’
zezwalać_zezwolić                      ‘allow, permit, let’
dopomagać_dopomóc                      ‘help, aid, assist’
wahać się_zawahać się                  ‘hesitate, waver’
zakazywać_zakazać                      ‘forbid, prohibit’

2 Translations are missing if they were not included in Polish-English dictionaries.


Verb in cluster 2                      Translation
spieszyć_pospieszyć                    ‘hurry, be in a hurry’
umożliwiać_umożliwić                   ‘enable, make possible’
krępować się_                          ‘be embarrassed, feel uneasy’
spodziewać się_                        ‘hope, expect’
pragnąć_                               ‘desire’
potrzebować_                           ‘need, want, be in need of’
nawykać_nawyknąć                       ‘become accustomed’
_uwziąć się                            ‘set one’s mind, become crazy’
chcieć_                                ‘want, be willing, intend, desire, wish’
kwapić się_pokwapić się                ‘be eager’
_omieszkać                             ‘fail’
przyobiecywać_przyobiecać              ‘promise’
podejmować_podjąć się                  ‘undertake’
wzbraniać_wzbronić się                 ‘forbid’
zapominać_zapomnieć                    ‘forget’
brzydzić się_                          ‘abhor, loathe, have an aversion’
_zdołać                                ‘be able’
zobowiązywać_zobowiązać się            ‘bind, pledge oneself’
radzić_poradzić                        ‘advise’
przyrzekać_przyrzec                    ‘promise’


Verb in cluster 3                      Translation
doradzać_doradzić                      ‘advise’
rozkazywać_rozkazać                    ‘order, command’
planować_zaplanować                    ‘plan’
życzyć[sobie]_zażyczyć[sobie]          ‘wish, desire’
ofiarowywać_ofiarować
przysięgać_przysiąc                    ‘swear’


Verb in cluster 4                      Translation
uwielbiać_uwielbić                     ‘adore, worship’
ubóstwiać_                             ‘idolize, adore’
kochać_                                ‘love’
ważyć_odważyć się                      ‘dare, venture’
szykować_przyszykować się
lubić_                                 ‘like, love’
nienawidzić_                           ‘hate, detest’
ośmielić_ośmielić się                  ‘venture, dare’
umieć_                                 ‘know how, be able’
odważać_odważyć się                    ‘dare, venture’
prosić_poprosić                        ‘ask, beg, request’


Verb in cluster 5                      Translation
kończyć_skończyć                       ‘end, finish, conclude, close’
przestawać_przestać                    ‘cease, stop, discontinue’
poczynać_począć                        ‘begin, originate’
dawać_dać się                          ‘let, allow’
mieć_                                  ‘have to’
musieć_                                ‘be obliged to, have to’
usiłować_                              ‘make efforts, endeavor, attempt’
kusić_skusić się                       ‘seek to obtain, attempt’
rozpoczynać_rozpocząć                  ‘begin, start, commence’
kontynuować_                           ‘continue’
móc_                                   ‘can, be able’
dokańczać_dokończyć                    ‘finish up, conclude’
zdążać_zdążyć                          ‘manage to do on time’
zaczynać_zacząć                        ‘begin, start, commence’
zamyślać_zamyślić                      ‘design’
żenować się_                           ‘feel embarrassed’
raczyć_raczyć                          ‘deign, condescend’
silić się_                             ‘make efforts, exert oneself’
śmieć_                                 ‘dare, venture’


Verb in cluster 6                      Translation
obiecywać[sobie]_obiecać[sobie]        ‘promise’
uczyć_nauczyć                          ‘teach, instruct’
polecać_polecić                        ‘recommend’
lękać się_                             ‘fear, be anxious’
uczyć_nauczyć się                      ‘learn’
przywykać_przywyknąć                   ‘get accustomed to’
postanawiać_postanowić                 ‘resolve, determine, make up one’s mind’
ślubować_ślubować                      ‘vow, make a vow’
nakazać_nakazać                        ‘order, command’
obawiać się_                           ‘fear, be afraid, be anxious’
przyzwyczajać_przyzwyczaić się         ‘become accustomed, get used to’


Verb in cluster 7                      Translation
starać_postarać się                    ‘endeavor, make efforts, take pain, try’
próbować_spróbować                     ‘try, test, attempt’
kazać_kazać                            ‘bid, order, let’
myśleć_                                ‘think, mean’
potrafić_potrafić                      ‘know how to, manage’
_zaofiarować się
pomagać_pomóc                          ‘help, aid, assist’
wstydzić się_                          ‘be ashamed’
woleć_                                 ‘prefer’
pamiętać_                              ‘remember, keep in mind’
_zechcieć                              ‘become willing’
zabraniać_zabronić                     ‘forbid, prohibit, interdict’
poważać_poważyć się


Volker Gast, Maria Koptjevskaja-Tamm 
The areal factor in lexical typology 


Some evidence from lexical databases 


Abstract: Our study aims to explore how much information about areal patterns 
of colexification we can gain from lexical databases such as CLICS and ASJP. We 
adopt a bottom-up (rather than hypothesis-driven) approach, identifying areal 
patterns in three steps: (i) determine spatial autocorrelations in the data, (ii) iden- 
tify clusters as candidates for convergence areas and (iii) test the clusters result- 
ing from the second step controlling for genealogical relatedness. Moreover, we 
identify a (genealogical) diversity index for each cluster. This approach yields 
promising results, which we regard as a proof of concept, but we also point out 
some drawbacks of the use of major lexical databases. 


Keywords: areality, colexification, lexical database, lexical typology 


1 Introduction 


1.1 Lexical typology and areal linguistics 


The lexicon is arguably one of the most difficult domains for cross-linguistic and 
typological generalizations. It is much more loosely structured than, for instance, 
sound or tense systems and its structure is hardly reflected in linguistic form. At- 
tempts at identifying principles of lexical organization, for example, with sense 
relations in the structuralist tradition or through prototypes and family resem- 
blances therefore mostly rely on diagnostic tests and intuitions, even in the anal- 
ysis of individual languages (e.g., Lipka 1992; Cruse 1986; Kleiber 1990; Geeraerts 
2010). Comparing the lexicons of different languages obviously poses an even 
greater challenge and some would say that such comparison is not even possible,
as word meanings are only defined relative to the systems they form part of (cf.
Evans 2011: §1.3 for some discussion). Despite such difficulties, some non-trivial 
generalizations have been formulated about the organization of lexicons. Most of 
the insights concern individual domains of meaning, such as color terms (Berlin 
and Kay 1969 and follow-up work), verbs of motion (Talmy 1985 and follow-up 
work), kinship terms (e.g., Nerlove and Romney 1967; Dahl and Koptjevskaja- 
Tamm 2001), body parts (e.g., Majid, Enfield and Van Staden 2006; Brown 2013a, 
2013b), perception (e.g., Viberg 1984; Evans and Wilkins 2000; Vanhove 2008a), 
temperature terms (Koptjevskaja-Tamm 2015) and verbs of aqua-motion 
(Koptjevskaja-Tamm, Divjak and Rakhilina 2010), to name just a selection of the 
domains that have figured prominently in this branch of typology. The findings 
of lexical-typological studies often take the form of “holistic characterizations” 
(e.g., “language x is a satellite-framed language” in terms of Talmy 1985) or im- 
plicational relations (e.g., “if a language has three basic color terms, one of them 
is ‘red’”; Berlin and Kay’s 1969 stage II). 

Broader generalizations about possible systems of lexical domains have of- 
ten been represented using the semantic map methodology, particularly in the 
domain of function words such as indefinites (e.g., Haspelmath 1997; van der Au- 
wera and Van Alsenoy 2011) and impersonal pronouns (e.g., van der Auwera, 
Gast and Vanderbiesen 2012; Gast and van der Auwera 2013) but also in other, 
“more lexical” domains (cf. Koptjevskaja-Tamm, Rakhilina and Vanhove 2015 for 
an overview of implicational and probabilistic semantic maps in lexical typology; 
Rakhilina and Reznikova 2016 on the use of semantic maps in the frame-based 
approach to lexical typology). Semantic maps can be regarded as networks (tech- 
nically, graphs; cf. Gast and van der Auwera 2013) representing patterns of mul- 
tifunctionality manifested by semantically/functionally “comparable” linguistic 
expressions (e.g., morphemes, words, constructions) of particular languages, 
where the main guiding principle is the “contiguity/connectivity requirement”. 
More specifically, functions (uses, meanings, contexts) that are often associated 
with one and the same linguistic expression, represented as “nodes” (or “verti- 
ces”) in the graph, are connected by “edges” or they cover a contiguous region 
on a semantic map. 
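
The contiguity/connectivity requirement can be illustrated with a short sketch. The code below is an illustration under stated assumptions rather than an implementation taken from the literature cited above: the map is represented as a list of undirected edges, the four-node map is invented, and the function name `is_contiguous` is hypothetical.

```python
from collections import deque


def is_contiguous(edges, functions):
    """Check whether `functions` cover a connected region of the map.

    `edges` is a list of undirected (node, node) pairs; `functions` is the
    set of nodes expressed by one and the same linguistic expression.
    """
    functions = set(functions)
    if not functions:
        return True
    # Keep only edges whose two endpoints are both expressed by the item.
    adjacency = {function: set() for function in functions}
    for a, b in edges:
        if a in functions and b in functions:
            adjacency[a].add(b)
            adjacency[b].add(a)
    # Breadth-first search from an arbitrary function.
    start = next(iter(functions))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == functions


# Invented linear map: A - B - C - D
toy_map = [("A", "B"), ("B", "C"), ("C", "D")]
print(is_contiguous(toy_map, {"A", "B", "C"}))  # covers a connected region
print(is_contiguous(toy_map, {"A", "C"}))       # skips the linking node B
```

An expression covering A and C but not B violates the requirement on this toy map; a map is usually built so that no attested expression does so.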

While semantic maps may be used to represent the range of meanings asso- 
ciated with any kind of linguistic expression, the term “colexification”, coined by 
François (2008) and since then widely adopted, specifically targets the
expression of two (supposedly different) concepts with one word. Its major advantage
is that it is non-committal with respect to why two concepts are expressed with
the same linguistic element. Polysemy is certainly the most interesting case in
cross-linguistic studies but colexification may also emerge for other reasons,
such as the reanalysis of scope relations (cf. the case of impersonal pronouns
discussed in Gast and van der Auwera 2013). Colexification is also less rigorous when
it comes to the contiguity/connectivity requirement insofar as linking elements 
on semantic maps may sometimes be lost in historical developments, leaving us 
with two words that do not cover a contiguous space on the map but are never- 
theless colexified (cf. van der Auwera and Temürcü 2006; van der Auwera and 
Van Alsenoy 2013). 
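
Operationally, colexification in a single language can be read off a word list by grouping concepts that share a form. The following sketch assumes a word list represented as a mapping from concepts to sets of forms; the forms are invented and the function name `colexifications` is hypothetical. In line with the non-committal character of the notion, the code does not distinguish polysemy from other sources of identity of form.

```python
from collections import defaultdict
from itertools import combinations


def colexifications(word_list):
    """Return the set of concept pairs that share at least one form.

    `word_list` maps each concept to the set of forms expressing it.
    """
    concepts_by_form = defaultdict(set)
    for concept, forms in word_list.items():
        for form in forms:
            concepts_by_form[form].add(concept)
    pairs = set()
    for concepts in concepts_by_form.values():
        # Sorting makes each pair appear in one canonical order only.
        pairs.update(combinations(sorted(concepts), 2))
    return pairs


# Invented word list for a hypothetical language.
toy_language = {
    "finger": {"tika"},
    "toe": {"tika"},
    "hand": {"mapu"},
}
print(colexifications(toy_language))
```

Here ‘finger’ and ‘toe’ are reported as colexified because both are expressed by the invented form tika, while ‘hand’ stands alone.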

The concept of colexification can therefore be fruitfully used for the purposes 
of areal linguistics and areal typology (e.g., Urban 2012; Koptjevskaja-Tamm and 
Liljegren 2017). As is well-known, languages do not only borrow “matter” (e.g., 
loan words), but also “patterns” (Matras and Sakel 2004). A particularly common 
type of pattern transfer has been called “polysemy copying” (Heine and Kuteva 
2003, 2005) or “distributional assimilation” (Gast and van der Auwera 2012): As 
a result of “interlingual identification”, linguistic elements from contact lan- 
guages may assimilate their distributions. The results of such transfer can further 
be expected to be reflected in areal patterns of distribution. In fact, some colexi- 
fication patterns are cross-linguistically frequent, such as the colexification of 
‘finger’ and ‘toe’. Others show a genetically and/or areally restricted distribution, 
such as the colexification of ‘eat’ and ‘drink’ in many Papuan and Australian lan- 
guages (Aikhenvald 2009), as well as in some other languages of the world (for 
further examples, see Vanhove 2008b; Urban 2012; Juvonen and Koptjevskaja- 
Tamm 2016). Still others are very local or even language-specific, such as ‘beef’ 
expressed as ‘big meat’ in some of the languages of Hindukush - the mountain- 
ous region comprising northern Pakistan, northeastern Afghanistan and the 
northern-most part of Indian Kashmir (Koptjevskaja-Tamm and Liljegren 2017). 
It has been shown, even before the notion of colexification came into use, that 
there are clear areal patterns in, for example, the distribution of languages dis- 
tinguishing specific color terms (cf. Kay and Maffi 2013a, 2013b) and the body 
parts ‘arm’, ‘finger’ and ‘hand’ (cf. Brown 2013a, 2013b). The challenge is
therefore both to identify such areal patterns and to provide explanations for them.

A systematic cross-linguistic study of colexification patterns needs a lot of 
data. Most of the relevant investigations have been based on retrieving dictionary 
data and/or elicitation, both of which are very time-consuming, which explains 
why this research has so far been relatively restricted; broadening it would
require new and more efficient methods of data collection. For instance, the range of
data for the comparative study of lexical-semantic patterns can be broadened by 
relying on parallel texts. This approach, pursued by Östling (2016), among others,
represents a highly promising direction for future research, specifically when the
range of texts and registers can be broadened. At present, most studies using
“massively parallel corpora” (Cysouw and Wälchli 2007) rely on the Bible and,
consequently, on a rather specific (written) register, which introduces a certain 
bias in terms of both the topics covered and the vocabulary used. For instance, 
the richness of kinship terminology found in some parts of the world is obviously 
not retrievable from Bible texts. Still, parallel texts clearly provide a useful re- 
source for the cross-linguistic study of lexical semantics that should be explored 
further. 

The recent development of typological resources with a global coverage of 
data has made the study of a broader range of colexification patterns possible. 
For example, the Database of cross-linguistic colexifications (CLICS version 1.0; 
see List et al. 2014) provides an interface for the visualization and graphical in- 
spection of colexification patterns in a sample of 221 languages. Such databases, 
and their potential (as well as limitations), constitute the main topic of this con- 
tribution. In addition to the CLICS database, which has been designed specifi- 
cally for the study of colexification, we will use data from another lexical data- 
base for comparison, the database of the Automated similarity judgment 
programme (ASJP; see Wichmann, Holman and Brown 2016). 


1.2 Main questions addressed in this study 


The main question addressed in this contribution can be formulated as follows: 
what can we learn from cross-linguistic lexical databases such as CLICS and ASJP 
to gain a better understanding of global patterns of lexical organization and their 
areal distributions? As outlined in Koptjevskaja-Tamm and Liljegren (2017), at 
least the following groups of lexico-semantic phenomena may serve as indicators 
of areality: 


-  lexico-semantic parallels: shared colexification patterns and/or shared
lexico-constructional patterns/calques, such as the colexification of ‘fruit’ and
‘child’, or ‘fruit’ being expressed as ‘child of tree’ across many West African
languages, both cases involving a semantic association between ‘child’ and
‘fruit’;

-  shared formulaic expressions, such as the farewell expressions au revoir
(French), auf Wiedersehen (German), på återseende (Swedish), do svidanija
(Russian) and näkemiin (Finnish), which follow the same model across a
number of European languages;

-  area-specific lexicalizations and a shared or similar-looking internal
organization of certain semantic domains, such as a highly specialized vocabulary
describing dairy practices and dairy products across the languages of the
Greater Hindukush or the different areally defined patterns in the systems of
phasal adverbials ‘still’, ‘no longer’, ‘not yet’ and ‘already’ in European
languages (see van der Auwera 1998a).


Since the existing cross-linguistic lexical databases such as CLICS and ASJP are 
restricted in their data coverage, our main question will specifically target shared 
colexification patterns. In other words, we want to determine what information 
we can gain about colexifications from the databases. “Gaining information 
from” the databases basically comprises two complementary aspects: we can test 
existing claims and hypotheses about colexification patterns and we can explore 
new patterns that have not been identified so far. In this study, we pursue the 
latter approach. 

We would like to make it clear from the outset that our approach is entirely 
unprejudiced. Neither of us was involved in either project (CLICS or ASJP) and we 
are using the data in a way in which anyone else could use them as well. The 
study has an exploratory character and a certain methodological focus. The dis- 
cussion will therefore contain some critical remarks, which are not intended to 
call into question the merits of open-access resources like CLICS and ASJP, or the 
value of large-scale databases in general. Moreover, it should be borne in mind 
that we are using the data from these databases in ways that were not intended 
or anticipated by the creators. Any kind of limitation pointed out is thus not to be 
seen as criticism of the databases. 

Following some remarks on the data and methodology in Section 2, we pre- 
sent some rather general observations about the scope and limits of the databases 
in Section 3. In Section 4, we present a bottom-up approach to the identification 
of areal colexification patterns, as well as some results obtained in this way. Sec- 
tion 5 contains some conclusions. 


2 Remarks on the data and methodology 


2.1 Lexical databases used in the study 


CLICS is an online database of colexifications (called “synchronic lexical
associations” on the homepage)¹ with data from 221 language varieties of the world. It
draws on four types of (digital) resources (see Mayer et al. 2014): 


1 http://clics.lingpy.org/main.php (accessed 1 June 2017). 




-  the Intercontinental dictionary series (IDS; Key and Comrie 2015), which
emerged from a long-term project that started in the 1980s and implied the
compilation of word lists (by experts) for the expression of 1,310 concepts in
233 languages - for CLICS, a reduced set of (cleaned-up) data from 178
languages was used;

-  the World loan word database (WOLD; Haspelmath and Tadmor 2009), which
contains a large amount of vocabulary (1,000-2,000 items) from 41
languages, compiled by experts - data from 33 of the languages were included
in CLICS;

-  the online dictionary LOGOS, from which the authors extracted data for four
languages not represented in IDS or WOLD;

-  the Språkbanken research unit at the University of Gothenburg provides ten
word lists of South Asian and Himalayan languages,² six of which were used
for CLICS.


The database has been specifically developed for extracting information on co- 
lexifications across languages. As its creators explain, “[i]t is designed to serve as 
a data source for work in lexical typology, diachronic semantics, and research in 
cognitive science that focuses on natural language semantics from the viewpoint 
of cross-linguistic diversity. Furthermore, CLICS can be used as a helpful tool to 
assess the plausibility of semantic connections between possible cognates in the
establishment of genetic relations between languages” (CLICS website).³

Information on colexifications can be extracted in various ways. Using the 
query interface, it is possible to find out whether two specific concepts are linked 
in the language varieties in CLICS and to determine how many links to different 
concepts are reported for a specific concept. Users can also browse the concept 
networks that have been extracted from the data by the database creators and 
download parts of the data for large-scale quantitative investigations, which is 
what was done for the purposes of this study. 

Given the heterogeneity of sources, there are no consistent,
language-independent definitions of the concepts represented in the database.⁴ As far as areal


2 https://spraakbanken.gu.se/eng/research/digital-areal-linguistics/word-lists (accessed 11 
April 2017). 

3 http://clics.lingpy.org/main.php (accessed 11 April 2017). 

4 We will leave the discussion of labelling and (the absence of) definitions for the entries in the 
different lexical databases for a future occasion. 
coverage is concerned, the authors make the following disclaimer (see CLICS
website):⁵


Coverage of the world’s languages in both IDS and WOLD is biased towards certain regions 
of the world. In the case of IDS, South American languages and languages of the Caucasus 
are overrepresented. In the case of WOLD, languages of Europe figure particularly promi- 
nently. Since it is possible and even expectable that certain polysemies in the lexicon are 
frequent or even restricted to certain areas of the world, we advise researchers interested in 
cross-linguistic diversity to take appropriate measures to rule out unwarranted generaliza- 
tions due to areal effects. 


The database of ASJP was created for the purposes of comparative historical lin- 
guistics, as a means for evaluating the similarity of words from different lan- 
guages with the same meaning and, ultimately, for classifying languages compu- 
tationally on the basis of the observed lexical similarities. It grew out of a 
collaboration of “25 professional linguists and other interested parties working 
as volunteer transcribers and/or extending aid to the project in other ways” (Wikipedia).⁶
The database (version 17, April 2017) provides information on the expression
of mainly 40 concepts from the (100-item) Swadesh list in 4,664 languages
and was compiled following guidelines distributed to the contributors.⁷ As the ASJP
data is based on the Swadesh list, there is, as far as we can tell, no language- 
independent definition of the concepts. The English words thus function as a ter- 
tium comparationis. Given the sheer quantity of languages represented in the 
data, there is, inevitably, a certain amount of heterogeneity in it but it should be 
borne in mind that the project was coordinated by specialists (especially Cecil H. 
Brown, Søren Wichmann and Eric W. Holman) and the database continues to be
curated. 


2.2 Extracting and processing the data 


As both CLICS and ASJP associate concepts with words, we can (automatically)
identify colexifications through a simple comparison of words and their
associated meanings. For the CLICS data, we used the file “links.csv” from the “official”
download link.⁸ It contains 32,536 links (colexification types), each of them for a


5 http://clics.lingpy.org/faq.php#datal (accessed 4 April 2017).

6 https://en.wikipedia.org/wiki/Automated_Similarity_Judgment_Program#The_ASJP_Consortium
(accessed 4 April 2017).

7 http://asjp.clld.org/static/Guidelines.pdf (accessed 1 June 2017). 

8 http://clics.lingpy.org/download.php (accessed 1 June 2017). 
certain number of languages. Altogether, it provides information about 91,673
instances of colexification in 221 languages.

The file with the ASJP data⁹ contains 7,221 word lists. Nearly all consist of a
40-item subset of the 100-item Swadesh list found to constitute the diachronically 
most stable items (Holman et al. 2008). Most of these 40-item lists are incomplete. 
In addition, there are a little over 300 full 100-item lists (also generally not com- 
plete), most of which hark back to the beginning of the project, as described in 
Brown et al. (2008), before the selection of stable items was made. Often, more 
than one word is listed for a given concept. We mainly made use of the data from 
the 40-item lists because of the scarcity of data for the 60 items not on that list 
(words for ‘feather’ and ‘bark’ constitute exceptions, see Section 4). 

As we wanted to keep things comparable and consistent, specifically with 
respect to geographical data, we mapped all the varieties to Glottolog codes. This 
led to some loss of information. The 7,221 word lists of the ASJP file were associ- 
ated with 4,675 ISO 639-3 codes. We lumped the data from different word lists 
with the same ISO 639-3 code. This may have led to a certain loss of accuracy in 
the data, as we may have mixed data from distinct language varieties. Some data 
was also lost because some mappings from ISO 639-3 codes to Glottolog codes 
were missing. In this way, we obtained information about 690 colexification 
types in 4,554 languages. The CLICS data was reduced for the same reason, leav- 
ing us with 4,064 colexification types from 196 languages. 
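
The lumping and mapping step described above can be illustrated with a minimal Python sketch. It is not the procedure actually used for this study: the function name `lump_word_lists`, the input format and all codes and forms in the toy data are hypothetical, and word lists are assumed to be available as concept-to-forms dictionaries keyed by ISO 639-3 code.

```python
from collections import defaultdict


def lump_word_lists(word_lists, iso_to_glottocode):
    """Merge word lists sharing an ISO 639-3 code and re-key them by Glottolog code.

    word_lists: iterable of (iso_code, {concept: set_of_forms}) pairs.
    iso_to_glottocode: dict from ISO 639-3 codes to Glottolog codes (may be incomplete).
    Returns (merged, dropped): merged maps Glottolog code -> {concept: set_of_forms};
    dropped lists the ISO codes for which no Glottolog mapping was available.
    """
    by_iso = defaultdict(lambda: defaultdict(set))
    for iso, concepts in word_lists:
        for concept, forms in concepts.items():
            by_iso[iso][concept] |= set(forms)  # lumping: union of all forms

    merged, dropped = {}, []
    for iso, concepts in by_iso.items():
        glottocode = iso_to_glottocode.get(iso)
        if glottocode is None:
            dropped.append(iso)  # this data is lost: no mapping available
            continue
        merged[glottocode] = {c: set(f) for c, f in concepts.items()}
    return merged, sorted(dropped)


# Toy input (invented codes and forms, for illustration only)
toy_lists = [
    ("aaa", {"hand": {"ka"}, "arm": {"ka"}}),
    ("aaa", {"hand": {"ka", "kat"}}),  # second word list with the same ISO code
    ("bbb", {"hand": {"mi"}}),         # no Glottolog mapping below
]
toy_mapping = {"aaa": "aaaa1234"}
merged, dropped = lump_word_lists(toy_lists, toy_mapping)
```

Taking the union of forms is one simple way of lumping; it makes the loss of accuracy mentioned above visible, since forms from distinct varieties end up in the same set.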

In order to identify the areal distribution of colexification patterns, we also 
need negative evidence, i.e., information about the absence of a colexification 
pattern. As CLICS focuses specifically on colexification, not differentiation, this 
information is not per se included in the database. We can, however, retrieve a 
certain amount of negative evidence from the data, as the colexification patterns 
associate pairs of concepts with sets of forms. The ASJP data provides information 
about specific form-meaning pairings. Whenever we had information about the 
form encoding a given concept and when the two forms corresponding to the 
members of a colexification pair were different, we assumed that the pair of con- 
cepts in question was differentiated in the given language. 

After adding cases of differentiation to our data, our (enriched) ASJP
database contained 2,060,856 data points (a subset of the 4,554 languages multiplied
by 690 colexification patterns) and the CLICS database contained 92,805 data
points (a subset of the 196 languages × 4,064 colexification patterns).¹⁰ Each data


9 The file “dataset.tab” from http://asjp.clld.org/download (accessed 1 June 2017). 
10 The data frames can be downloaded at http://www.uni-jena.de/~mu65qev/data in csv-files 
(accessed 4 April 2018). 
point is a quadruple of two concepts, a (Glottolog) language code and a value of
“t” (true/colexified) or “f” (false/differentiated): <‘arm’, ‘hand’, russ1263, t>, for 
instance, says that ‘arm’ and ‘hand’ are colexified in Russian. The Glottolog code 
can, moreover, be mapped to geographical and genealogical information. 
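
A minimal Python sketch of how such quadruples, including the negative evidence, might be derived from form-meaning pairings is given below. It rests on assumptions that are not taken from the study: the function name `data_points` is hypothetical, two concepts count as colexified as soon as they share one identical form, and the toy forms are simplified illustrations.

```python
from itertools import combinations


def data_points(lexicon):
    """Yield (concept_a, concept_b, glottocode, value) quadruples.

    lexicon: dict glottocode -> {concept: set_of_forms}.
    value is "t" if the two concepts share at least one identical form
    (strict colexification) and "f" if forms are known for both concepts
    but none is shared (differentiation). Concepts without any form are
    skipped, since they provide no evidence either way.
    """
    for glottocode, concepts in sorted(lexicon.items()):
        known = sorted(c for c, forms in concepts.items() if forms)
        for a, b in combinations(known, 2):
            shared = concepts[a] & concepts[b]
            yield (a, b, glottocode, "t" if shared else "f")


# Toy lexicon (simplified, for illustration only)
toy = {
    "russ1263": {"arm": {"ruka"}, "hand": {"ruka"}},
    "stan1293": {"arm": {"arm"}, "hand": {"hand"}, "feather": set()},
}
points = list(data_points(toy))
```

In this toy lexicon, the first entry yields a "t" quadruple for ‘arm’ and ‘hand’, the second an "f" quadruple, and the empty entry for ‘feather’ yields nothing.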

As far as the types of concepts represented in the databases are concerned, 
the ASJP data covers a certain range of nominal concepts from the domains of 
body parts (e.g., ‘eye’, ‘ear’, ‘nose’), animals and plants (e.g., ‘louse’, ‘dog’, ‘fish’) 
and nature (e.g., ‘sun’, ‘star’, ‘water’), as well as ‘person’ and ‘name’. The data- 
base furthermore contains data on five verbs (i.e., ‘drink’, ‘die’, ‘see’, ‘hear’ and 
‘come’), two adjectives (i.e., ‘new’ and ‘full’), the numerals ‘one’ and ‘two’ and 
the pronouns ‘I’, ‘we’ and ‘you’. As pointed out above, we have also made use of 
some of the other material in the data, which covers a few additional basic con- 
cepts such as ‘feather’, ‘hair’ and ‘bark’. CLICS contains an overall much more 
varied set of concepts, including a high number of kinship terms, some of them 
highly specific (see the examples discussed in Section 4.5). 


3 What the databases can(not) do for us 


Our data indicates, for pairs of concepts, whether they are colexified or differen- 
tiated in a given language or variety. This is obviously a simplification. For exam- 
ple, many languages may have different words for ‘day’ and ‘night’ and still use 
‘day’ as a cover term. If a language is shown to colexify a given pair of concepts, this
does not necessarily mean that it does not have different words for the individual
concepts as well. ‘Arm’ and ‘hand’ provide a well-known problem in this respect,
as many languages (e.g., Slavic ones) have a cover term for both body parts while 
also having more specific terms for each part. For example, Russian ruka covers 
both the arm and the hand but there is a more specific word for ‘hand’ as well, 
i.e., kist’, which is used much more rarely and only in specific contexts. There is 
thus both colexification and differentiation. We assume that the data in the data- 
bases, by and large, represent the “preferred” patterns of colexification or differ- 
entiation in the languages in question but it should be clear that the distinction 
between the two cases is an idealization (note that this does not only apply to the 
data used for this study but also to World atlas of language structures maps like 
those of Brown 2013a, 2013b). 

Moreover, we should bear in mind that colexification is a very general term, 
as pointed out above, and that, unlike in the semantic map methodology, it does 
not necessarily indicate immediate or exclusive relatedness of two concepts. This 
point can be illustrated with patterns that emerged from an inspection of
extremely rare cases of colexification. For instance, the colexification pair
<‘daughter-in-law (of a woman)’, ‘father-in-law (of a man)’>, found in only two languages,
seems quite remarkable. Closer inspection of the data shows, however, that both
cases of this type (one from WOLD, one from the IDS) are instances of extremely
general kinship terms, rather than unusual cases of polysemy. Swahili 
(swah1253) mkwe, for example, roughly means ‘in-law’ and is thus not only used 
for the two concepts mentioned above but also for many other in-law relation- 
ships. Similarly, Polci (polc1243) kwam is used for all types of in-law relations 
holding between contiguous generations, according to the information given in 
the IDS. What this shows is that we should always look at broader patterns of 
multifunctionality before jumping to conclusions about pairwise meaning asso- 
ciations, specifically when we explore the data in a bottom-up fashion as in- 
tended in the present study. 

Finally, we will briefly address the questions of areal/lexical typology for 
which neither of the two databases can give us any information (and for which 
neither of them was made). CLICS and ASJP do not allow us to extract any infor- 
mation on shared lexico-constructional patterns or formulaic expressions. They 
only associate linguistic forms with concepts, so we can only identify pairs of
concepts that are expressed by exactly identical linguistic forms, with the
linguistic forms (lexemes) not being further analyzed. This identity of lexemes in
synchrony corresponds to François’s (2008: 171) notion of “strict colexification”.
Colexification and semantic associations can, however, be understood more
broadly, both diachronically, as two concepts being expressed by the same
lexeme at different periods in its history, and panchronically, as linked to each other
by derivation, composition or other constructions. To give one example: while 
‘eye’ and ‘eyelid’ in Khasi (khas1269) and in Kumyk (kumy1244) are expressed by 
the same form (fiiuhmat in Khasi, kóz in Kumyk), there are plenty of languages, 
including English, where ‘eyelid’ is expressed by a compound involving ‘eye’. 
Such cases of “loose colexification” (François 2008: 171), involving various kinds
of semantic shifts, are, arguably, much more difficult to represent and identify in
databases. A laudable attempt to systematize cross-linguistically recurrent pat- 
terns of loose colexification is represented in the Catalogue of semantic shifts in 
the languages of the world at the Institute of Linguistics of the Russian Academy 
of Sciences in Moscow. It is a database that currently contains more than 3,000 
semantic shifts found in 319 languages (see Zalizniak 2008; Zalizniak et al. 2012) 
but the organization of the database does not make it possible at present to draw 
any (statistical) conclusions on the distribution of these patterns across the lan- 
guages of the world. In this paper, we will therefore only use data from CLICS and 
ASJP, restricting ourselves to cases where the databases associate two different
concepts with exactly the same lexeme. 


4 Areal clusters of colexification patterns 


One of our intentions in using the data from CLICS and ASJP was to detect new 
patterns and generate new hypotheses about areal clusters of colexification pat- 
terns in a bottom-up approach. Our starting point is a map like the one in Figure 
1, which shows the distribution of languages colexifying and differentiating 
‘feather’ and ‘hair’ (see also Urban 2012 for this pattern) in the ASJP data. A black 
square stands for colexification, a red/empty circle for differentiation. Note that 
we are using this example because the relatively small number of data points
allows us to illustrate the method more easily. The noun ‘feather’ is contained in
the Swadesh list but it is not among the 40 core items used for ASJP, so the num- 
ber of data points is relatively small, and certain regions are heavily underrepre- 
sented (e.g., Africa). The full dataset of the (40-item) ASJP project has worldwide 
coverage (see the remarks made in Section 2.2). 


Fig. 1: Colexification (black squares) versus differentiation (red/empty circles) of ‘feather’ and
‘hair’ in the ASJP data


In order to identify areal clusters of colexifications, we proceeded in three steps: 


- identification of areally biased colexification patterns;
- identification of areal clusters in the biased patterns as candidates for “cluster
areas”, i.e., areas that are characterized by a given colexification pattern;
- testing the cluster areas controlling for genealogical relatedness.


4.1 Identifying areally biased colexification patterns 


In order to identify areally biased colexification patterns, we used the Join Count 
statistic (Cliff and Ord 1981), which is commonly applied to test for spatial
autocorrelations in binary data.¹¹ Assume that there is a grid of, say, eight by eight
cells and all of the cells are either black or white. If black and white cells are dis- 
tributed over that grid as on a chessboard, no cell has a (horizontal or vertical) 
neighbor of the same color, i.e., there are no (same color) “joins” at all. In this case,
there is a “negative autocorrelation”. If, by contrast, all the black cells are on the
left side of the grid and all the white cells on the right side, there are many joins
between cells of the same color: to be precise, 52 black-black joins and 52
white-white joins, as against eight black-white joins (in the middle of the board). In this
case, there is a positive spatial autocorrelation. The Join Count statistic compares
the observed number of same-color joins to the number expected on the basis of
a random distribution of colors. It is defined as the observed frequency minus the
expected frequency of identical joins, divided by the standard deviation of the 
expected frequency. The statistic indicates a direction of the correlation (positive 
or negative) and we can calculate a p-value for it (i.e., a value that indicates the 
probability of finding the distribution in question under the hypothesis that the 
colors are distributed randomly). 
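
The chessboard example can be checked with a few lines of Python. The sketch below only counts joins on a rectangular grid and computes the expected number of black-black joins under a random shuffling of the colors; it does not compute the standard deviation or the p-value, and the function names `join_counts` and `expected_bb` are illustrative rather than part of any package.

```python
def join_counts(grid):
    """Count black-black, white-white and black-white joins on a rectangular grid.

    grid: list of rows, each a list of 0 (white) or 1 (black).
    Two cells are joined if they are horizontal or vertical neighbors.
    """
    rows, cols = len(grid), len(grid[0])
    bb = ww = bw = 0
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):  # look right and down: each join once
                r2, c2 = r + dr, c + dc
                if r2 < rows and c2 < cols:
                    pair = grid[r][c] + grid[r2][c2]
                    if pair == 2:
                        bb += 1
                    elif pair == 0:
                        ww += 1
                    else:
                        bw += 1
    return bb, ww, bw


def expected_bb(n_joins, n_black, n_cells):
    """Expected black-black joins if the colors are shuffled at random over the cells."""
    return n_joins * n_black * (n_black - 1) / (n_cells * (n_cells - 1))


chessboard = [[(r + c) % 2 for c in range(8)] for r in range(8)]
split = [[1 if c < 4 else 0 for c in range(8)] for r in range(8)]

print(join_counts(chessboard))             # (0, 0, 112)
print(join_counts(split))                  # (52, 52, 8)
print(round(expected_bb(112, 32, 64), 2))  # 27.56
```

With 112 joins in total and 32 black cells out of 64, about 27.6 black-black joins are expected under randomness, so the observed 52 in the split grid lie far above and the observed 0 on the chessboard far below the expectation.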

In order to apply the Join Count test to linguistic data,¹² we have to transform
our data points into a grid or network of “neighbors”. This implies that we have
to decide what languages count as neighbors. As we are dealing with questions
of language contact, “being a neighbor” should mean “potentially being in
contact with each other”. Obviously, it is hard to generalize over distances making
language contact (im)possible, as this varies with the local habitat (e.g., sea ver- 
sus mountains). Moreover, the distance has to be adapted to the density of data 
points, i.e., we need larger distances for sparser data. We experimented with
distances between 500 km and 2,000 km and, comparing the results with the
clustering applied at the second stage, found that a neighbor distance of 2,500 km
was a reasonable choice for the CLICS data and a distance of 1,000 km for the
ASJP data. On the basis of these distances, the maps were transformed into
networks of neighbors, as illustrated in Figure 2 for Mesoamerica.¹³ Note that only
those languages are shown for which we have information on colexification or
differentiation for the pair <‘feather’, ‘hair’>.


11 Note that spatial autocorrelation is a common problem in many natural sciences such as
biology and geography. As in linguistic typology, data points are often influenced by neighboring
data points.

12 We used the function ‘joincount()’ from the R package spdep (see Bivand, Hauke and
Kossowski 2013; Bivand and Piras 2015).


Fig. 2: Neighbor network for languages of Mesoamerica and Central America 
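
The construction of such a neighbor network can be sketched in Python as follows. This is an illustration only, not the spdep-based implementation mentioned in the footnotes: the function names are hypothetical, the Earth is treated as a sphere with an assumed mean radius of 6,371 km, and the three languages and their coordinates are invented.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(p, q):
    """Great-circle distance in km between two (longitude, latitude) points in degrees."""
    lon1, lat1, lon2, lat2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def neighbor_network(coords, max_km):
    """Map each language to the set of languages located within max_km of it."""
    names = sorted(coords)
    network = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if haversine_km(coords[a], coords[b]) <= max_km:
                network[a].add(b)
                network[b].add(a)
    return network


# Hypothetical languages with invented (longitude, latitude) coordinates
toy_coords = {"lang_a": (-91.0, 16.0), "lang_b": (-89.0, 17.0), "lang_c": (-64.0, -3.0)}
network = neighbor_network(toy_coords, max_km=1000)
```

With a neighbor distance of 1,000 km, the two nearby toy languages are linked to each other, whereas the third one remains without neighbors.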


4.2 Identifying clusters 


For those colexification patterns which showed an areal bias (positive
autocorrelation) according to the Join Count test (at a p-value < 0.05), we determined
clusters using hierarchical cluster analysis on the basis of a (geographical) distance
matrix.¹⁴ We distinguished three types of clusters, defined on the basis of their
(maximal) distance to a neighboring cluster, i.e., “micro-clusters” (2,000 km),
“meso-clusters” (4,000 km) and “macro-clusters” (6,000 km).


13 For the identification of neighbors, we used the function ‘dnearneigh()’ of the spdep package 
for R (Bivand, Hauke and Kossowski 2013, Bivand and Piras 2015). 

14 As we are dealing with geographical data, we used the function ‘distm()’ from the
‘geosphere’ package for R (Hijmans 2016), rather than the native ‘dist()’ function of R. By default, the
function uses the Haversine Great Circle method (fun=distHaversine). For the cluster analysis,
we used the native ‘hclust()’ function of R, with the linkage method ‘complete’ (for “round”
clusters).
Fig. 3: Dendrogram for the colexification pair <‘feather’, ‘hair’> (meso-clusters)


The results of such cluster analyses are commonly represented in the form of
dendrograms, as shown in Figure 3. The dendrogram shows the grouping of the
languages exhibiting colexification of ‘feather’ and ‘hair’, represented by their
Glottolog codes at the bottom, into meso-clusters. The y-axis of the diagram
(“height”) indicates distances between nodes. The splits between groups of
languages in Figure 3 are based on the maximum distances between pairs of
elements from sisters in the tree. As the figure shows the clustering of the relevant
languages into meso-clusters, the cut-off distance is (by definition) 4,000 km.
The (red) boxes at the bottom illustrate the clusters emerging from this cut-off 
point (the upper edge of any box is located at that “height”, i.e., the distance). 
Note that the critical distances are upper boundaries, so different cut-off points 
may deliver the same clusters. The largest distance between pairs of elements 
from a cluster is thus at most 2,000 km, 4,000 km or 6,000 km, depending on the type of
cluster.¹⁵ The geographical locations of the five clusters emerging from Figure 3
are shown in Figure 4. The dotted circles are positioned around the geographical
center of a cluster¹⁶ and their radius corresponds to the largest distance of any
one cluster language to the center. The area indicated by the circle enclosing a
cluster will be referred to as the “cluster area” in each case. Each cluster area has
a numerical identifier for the purpose of data processing and textual reference. 
The cluster areas determined on this basis will serve as hypotheses for linguistic 
contact areas characterized by the relevant colexification patterns (potentially 
among other features, of course). 
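
The effect of the cut-off distance can be illustrated with a small pure-Python sketch of complete-linkage clustering. It is not the ‘hclust()’-based analysis itself: the function name `complete_linkage_clusters` is hypothetical, and the toy data are points on a line (positions in km) rather than geographical coordinates, which keeps the distances easy to verify by hand.

```python
def complete_linkage_clusters(items, dist, cutoff):
    """Agglomerative clustering with complete linkage, cut at a maximum diameter.

    items: list of labels; dist(a, b): symmetric distance function;
    cutoff: two clusters are merged only if the largest pairwise distance
    in the merged cluster does not exceed this value.
    """
    clusters = [[item] for item in items]

    def linkage(c1, c2):
        # complete linkage: distance between the two farthest members
        return max(dist(a, b) for a in c1 for b in c2)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cutoff:
            break  # every remaining merge would exceed the cut-off
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy data: positions along a line, in km (invented)
position = {"a": 0, "b": 1500, "c": 3500, "d": 9000, "e": 10000}


def line_distance(x, y):
    return abs(position[x] - position[y])


micro = complete_linkage_clusters(sorted(position), line_distance, cutoff=2000)
meso = complete_linkage_clusters(sorted(position), line_distance, cutoff=4000)
```

With a cut-off of 2,000 km, the point c stays on its own because it lies 3,500 km from a; with a cut-off of 4,000 km, it joins a and b. The pair d and e forms the same cluster under both cut-offs, which illustrates why different cut-off points may deliver the same clusters.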


15 We also applied model-based clustering, where the cut-off points between clusters are
determined on the basis of specific diagnostics of the relevant models, such as the Bayesian
Information Criterion, as in the function ‘mclust()’ of the R package mclust (Fraley and Raftery 2002;
Fraley et al. 2012). This approach yielded highly heterogeneous results in terms of scaling, how- 
ever, so comparability of clusters was compromised. 

16 The centers and radii were calculated by mapping the geographical coordinates to a
Cartesian coordinate system, determining the relevant data and mapping them back to geographical
coordinates.
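
One possible reading of the computation described in footnote 16 is sketched below in Python. The details are assumptions, not taken from the study: coordinates are mapped to unit vectors in three dimensions, the center is the direction of their mean, and the radius is the largest great-circle distance from the center to any member, using an assumed Earth radius of 6,371 km. The function names are hypothetical.

```python
from math import radians, degrees, sin, cos, asin, atan2, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(p, q):
    """Great-circle distance in km between two (longitude, latitude) points in degrees."""
    lon1, lat1, lon2, lat2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def cluster_center_and_radius(points):
    """Return ((lon, lat), radius_km) for a list of (lon, lat) points in degrees."""
    x = y = z = 0.0
    for lon, lat in points:
        lam, phi = radians(lon), radians(lat)
        x += cos(phi) * cos(lam)  # map each point to Cartesian coordinates
        y += cos(phi) * sin(lam)
        z += sin(phi)
    n = len(points)
    x, y, z = x / n, y / n, z / n
    # map the mean vector back to geographical coordinates
    center = (degrees(atan2(y, x)), degrees(atan2(z, sqrt(x * x + y * y))))
    radius = max(haversine_km(center, p) for p in points)
    return center, radius


center, radius = cluster_center_and_radius([(-10.0, 0.0), (10.0, 0.0)])
```

For two points on the equator at 10 degrees west and 10 degrees east, the center lies at the origin of the coordinate system and the radius is roughly 1,112 km.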
Fig. 4: Hypothesized meso-clusters for the colexification pattern <‘feather’, ‘hair’>


4.3 Testing and analyzing clusters 


Obviously, a (hypothesized) cluster area may have emerged for reasons that are 
independent of the geographical location of the languages concerned; in
particular, it may just reflect the genealogical relatedness of neighboring languages
with inherited colexification patterns. Given that we are most interested in clus- 
ters that are areally conditioned (i.e., clusters that emerged through language 
contact or some other geographical factor), we determined the influence of the 
independent variable “membership of a language L to a given cluster area” on 
the dependent variable “presence of a colexification pattern in L”, controlling for 
genealogical relationships. We fitted (for each colexification pattern and cluster
size, i.e., micro, meso and macro) a mixed effects model, treating the
(highest-level) language family as a random effect.¹⁷ The data was pre-filtered and we only
ran regression analyses for colexification patterns with at least 20 TRUE cases for 


17 We used Bayesian logistic regression as implemented in the function ‘MCMCglmm()’ of the
MCMCglmm package for R (Hadfield 2010), with weakly informative prior assumptions (Gelman
et al. 2008). The main reason for this choice was the structure of the data, with many systematic
cases of complete separation. We compared the results from Bayesian regression with frequentist
(mixed-effects) methods and inspected the results for “sanity”, using binned residual plots
(Gelman and Su 2016). So-called Gelman priors assume a Cauchy distribution with center 0 and scale
2.5. The MCMCglmm package offers a function ‘gelman.priors()’, which we used for this purpose.
The number of iterations was set to 130,000, with 30,000 burnin iterations and a thinning inter- 
val of 10. 
the ASJP data and 15 TRUE cases for the CLICS data. In each model, we included
only data points from families such that one member of the family either exhib- 
ited the colexification pattern in question or was a member of one of the (hypoth- 
esized) cluster areas. The (Bayesian) regression model identifies a posterior 
mean-value for each cluster area (“pm”, a rough indicator of effect size) and a p- 
value (“p.pm”) showing the probability that the posterior mean is not higher than 
zero (and that membership to the cluster area thus has no effect). In the following 
discussion, we will mainly focus on the p-value. For the <‘feather’, ‘hair’>-pat- 
tern, the analysis delivered the values shown in Table 1. 


Tab. 1: Results of regression analysis for <‘feather’, ‘hair’> (meso-clusters) 


Colexification     Cluster   Long     Lat     Radius (km)   pm     p.pm
<feather, hair>    1         -91.2    16.0    937           1.19   0.16
<feather, hair>    2         -64.5    -2.8    2,043         0.89   0.27
<feather, hair>    3         106.2    16.5    544           2.75   <0.01
<feather, hair>    4         8.0      10.7    1,380         4.29   <0.001
<feather, hair>    5         153.2    -3.3    1,470         1.54   0.054
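
For orientation, the type of model behind the values “pm” and “p.pm” can be written down schematically. The formula below is a sketch under simplifying assumptions that are not taken from the analysis itself: the symbols are introduced for illustration only, a single cluster-area indicator is shown, and the Cauchy prior merely follows the description in footnote 17. Here, y_L is 1 if language L exhibits the colexification pattern, x_L is 1 if L belongs to the cluster area, and f(L) is the top-level family of L.

```latex
\[
\operatorname{logit} \Pr(y_L = 1) = \beta_0 + \beta_1 x_L + u_{f(L)},
\qquad u_f \sim \mathcal{N}(0, \sigma_u^2),
\qquad \beta_0, \beta_1 \sim \operatorname{Cauchy}(0, 2.5)
\]
```

On this reading, “pm” corresponds to the posterior mean of \beta_1 and “p.pm” to the posterior probability \Pr(\beta_1 \le 0 \mid \text{data}).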


To provide a somewhat better idea of the internal make-up of clusters, Figure 5 
shows cluster areas 1¹⁸ and 2 in more detail.


18 The prevalence of the <‘feather’, ‘hair’> colexification in Mesoamerica has been noted earlier,
among others by Smith-Stark (1994), who tests but ultimately discards it as a
Mesoamerican areal trait (“lexical calque”).
Fig. 5: Cluster areas 1 and 2 for the colexification pair <‘feather’, ‘hair’>


Membership to a cluster area may be a significant predictor even when the cluster 
is genealogically entirely homogeneous if there is a high number of non-cluster- 
members from the same family which do not exhibit the colexification pattern in 
question. The regression analysis only rules out that a cluster primarily contains
languages from families whose members tend to exhibit the pattern
independently of membership to the cluster. We therefore determined an indicator of
the genealogical diversity within each cluster as well. We used Shannon’s
diversity index (Simpson 1949) for this purpose, with a correction factor for small
sample sizes.¹⁹ We determined this index only for those languages of a cluster area
that showed the colexification pattern in question. In the <‘feather’, ‘hair’> clus- 
ters in Figure 4, we find different degrees of homogeneity, as Table 2 shows. 


Tab. 2: Diversity indices for <‘feather’, ‘hair’> (meso-clusters) (ASJP) 


Cluster area   Diversity index
1              1.01
2              1.38
3              0
4              0.77
5              0.44
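
The basic (uncorrected) Shannon index can be sketched in Python as follows. The correction factor for small sample sizes mentioned above is not reproduced here, so the output of this sketch should not be expected to match the values in Table 2; the function name `shannon_index` and the family labels are illustrative.

```python
from collections import Counter
from math import log


def shannon_index(families):
    """Shannon diversity H = -sum(p_i * ln p_i) over the family proportions p_i.

    families: list with one family label per language in the cluster.
    """
    counts = Counter(families)
    total = sum(counts.values())
    h = -sum((n / total) * log(n / total) for n in counts.values())
    return h if h > 0 else 0.0  # avoid returning -0.0 for a single family


one_family = shannon_index(["fam_a"] * 5)
six_families = shannon_index(["fam_a", "fam_b", "fam_c", "fam_d", "fam_e", "fam_f"])
two_families = shannon_index(["fam_a"] * 5 + ["fam_b"])
```

A cluster whose languages all belong to one family receives an index of zero, while the index grows with the number of families and with the evenness of their distribution.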


19 We used the ‘Hs()’ function of the R package DiversitySampler (Lau 2012), with the option ‘corr=T’.


The areal factor in lexical typology — 61 


As Table 2 shows, cluster area 3 in Southeast Asia is totally homogeneous and it 
actually provides a nice illustration of the fact that significant clusters need not be
genealogically diverse. All languages showing the colexification pattern in question are
Austroasiatic.²⁰ Given that, at the same time, all Austroasiatic languages showing
the colexification in question are members of this cluster, membership to this 
cluster correlates positively with the colexification pattern in question. All of these
languages belong to a single family, however, and the diversity index is therefore zero.
By contrast, cluster area 2 (in South America) is the most heterogeneous one, as 
the six languages showing colexification of ‘feather’ and ‘hair’ belong to six
different families.²¹ Cluster area 5 (in Melanesia) exhibits a relatively low degree of
diversity (six languages from two families). This is illustrated in Figure 6, where
families are identified with their Glottolog codes.²² (It should, of course, be borne
in mind that the density of data points is very low for this colexification pair, par- 
ticularly in Melanesia, and that the data is only used for illustrative purposes at 
this point.) 
Fig. 6: Cluster areas 2 and 5 for <‘feather’, ‘hair’>


20 Note that the degree of genealogical diversity of a cluster should ideally be measured not 
relative to top-level families but to lower-level branches. The Austroasiatic languages in cluster
3 actually belong to four different branches of that family: Bahnar (bahn1262) and Jeh (jehh1245):
Bahnaric; Kuy (kuyy1240): Katuic; Ksingmul (puoc1238): Khmuic; Chut (chut1247): Vietic.

21 Viz., Jodi-Saliban (jodi1234), Kakua-Nukak (kaku1242), Nadahup (nada1235), Tupian 
(tupi1275), Pano-Tacanan (pano1259) and Aikaná (aika1237). 

22 Viz., Austronesian (aust1307) and Nuclear Trans New Guinea (nucl1709). 




On the basis of the procedure described above, we can now identify clusters in a 
bottom-up fashion, generating some statistics that will allow us to estimate how 
interesting they are from the point of view of lexical typology and in an areal per- 
spective. Obviously, the best evidence for contact-induced clusters is provided by 
examples with a high degree of distinctiveness (which means that many lan- 
guages in the cluster exhibit the colexification in question and few non-cluster 
languages have it) and a high degree of genealogical diversity. We will now take 
a closer look at the clusters delivered by the data from ASJP and CLICS (Sections 
4.4 and 4.5). 


4.4 Clusters emerging from the ASJP data 


The procedure described in Sections 4.1 to 4.3 brought to light 23 colexification 
patterns showing a significant positive spatial autocorrelation and 120 cluster ar- 
eas that turned out to be significant predictors for a given colexification in the 
ASJP data, controlling for genealogical relatedness. The 23 colexification pairs 
are shown in (1), the 36 meso-clusters in Figure 7. 


(1) <‘I’, ‘fish’>; <‘I’, ‘we’>; <‘I’, ‘you’>; <‘bark’, ‘skin’>; <‘blood’, ‘die’>; <‘bone’,
‘die’>; <‘come’, ‘dog’>; <‘die’, ‘eye’>; <‘drink’, ‘water’>; <‘ear’, ‘hear’>; <‘ear’,
‘leaf’>; <‘ear’, ‘name’>; <‘eye’, ‘name’>; <‘feather’, ‘hair’>; <‘fire’, ‘tree’>;
<‘horn’, ‘knee’>; <‘horn’, ‘tooth’>; <‘liver’, ‘two’>; <‘louse’, ‘we’>; <‘man’,
‘person’>; <‘mountain’, ‘stone’>; <‘name’, ‘tooth’>; <‘see’, ‘we’>


Fig. 7: All meso-clusters of the ASJP data 
We can now apply further filters in order to identify the most interesting cases of
association between a colexification pattern and a cluster area. We can use the
diversity index for that purpose. Table 3 shows the top ten of the clusters, ordered
by the diversity index (all cluster areas are significant predictors at a 0.05 level,
according to the regression model).²³ It also indicates (in this order) the position
and radius of each cluster, the numbers of languages and families, the diversity
index, the posterior mean and the p-value for the latter value. A list of all
significant cluster areas is provided online.²⁴ The ten clusters in Table 3 correspond to
four colexification patterns: <‘fire’, ‘tree’>, <‘mountain’, ‘stone’>, <‘ear’, ‘leaf’>
and <‘bark’, ‘skin’>. We will now take a closer look at each of these patterns.


Tab. 3: Top ten clusters emerging from the ASJP data 


Colexification       Size    No   Long     Lat     Radius (km)   No lng   No fam   Div   pm     p.pm
<fire, tree>         meso    1    142.9    -7.8    2,557         53       16       2.3   2.87   <0.001
<mountain, stone>    macro   1    16.5     6.9     4,320         69       16       2.3   2.90   <0.001
<ear, leaf>          macro   1    29.8     7.8     3,876         39       14       2.2   2.98   <0.001
<ear, leaf>          meso    1    30.1     8.6     2,670         38       13       2.1   3.24   <0.001
<bark, skin>         macro   1    -65.3    -7.2    2,466         14       11       1.9   1.86   0.033
<ear, leaf>          micro   2    35.0     4.9     921           30       10       1.9   3.04   <0.001
<mountain, stone>    meso    2    133.1    -19.0   1,967         27       10       1.9   1.15   0.027
<mountain, stone>    meso    4    -64.8    -8.4    1,533         9        8        1.6   1.42   0.030
<fire, tree>         micro   6    130.6    -12.1   822           8        7        1.5   1.76   0.019
<mountain, stone>    meso    3    5.9      11.1    1,808         45       7        1.5   1.92   <0.001


23 Note that, in some cases, the same cluster areas were identified for different cluster sizes 
(e.g., meso and macro), as the cut-off points for clusters are maximal distances. In such cases, 
we obviously regarded the cluster as being of the smallest category. 

24 See http://www.uni-jena.de/~mu65qev/data/colex-tables/ASJP-clusters.htm (accessed 4 
April 2018). 
4.4.1 <‘Fire’, ‘tree’>


<‘Fire’, ‘tree’> is a well-known colexification pattern, also noted by Urban (2012) 
and Östling (2016) and comprehensively discussed by Schapper, San Roque and
Hendery (2016). We will therefore restrict ourselves to presenting our data. As has
been shown by Schapper, San Roque and Hendery (2016), colexification (either
strict or loose) of ‘fire’ and ‘tree’ is well-attested in the Sahul area comprising the
languages of Australia, New Guinea and surrounding islands, though not to the
extent suggested in earlier research. In fact, Schapper, San
Roque and Hendery (2016) argue that it is much more common for the Sahul lan- 
guages to colexify ‘fire’ with ‘firewood’, to the exclusion of ‘tree’. However, since 
‘firewood’ is not in the Swadesh list, this pattern cannot be extracted from the 
ASJP data. As we will see in Section 4.5, some relevant information can be ob- 
tained from the CLICS data, however. The ASJP data is shown in Figure 8 (in the 
following, only cluster areas that are significant predictors for the colexification 
pattern in question are indicated). 


Fig. 8: Significant meso-cluster for <‘fire’, ‘tree’> in the ASJP data 


4.4.2 <‘Mountain’, ‘stone’> 


The colexification of ‘mountain’ and ‘stone’ has been discussed by Östling (2016)
and is pervasive in the data used by Urban (2012). As Urban (2012) points out, 
Buck (1949: §1.22) already noticed a certain affinity between the meanings ‘moun- 
tain’ and ‘rock’ in Indo-European (e.g., Goth hallus ‘rock’, Old Norse hallr ‘large 
stone’ or ‘sloping’ as an adjective, probably related to Latin collis ‘hill’ and Lithuanian
kalnas ‘mountain’). Sometimes, such instances of colexification are probably
mediated by a word for ‘cliff’ (e.g., Old High German felis ‘rock’, Old Norse
fjall ‘mountain’, Irish all ‘rock, cliff’).

The macro-cluster shown in the second row of Table 3 covers the whole con- 
tinent of Africa. As Figure 9 shows, it is particularly prominent in two parts of 
Africa, i.e., a central area that can perhaps, roughly, be negatively defined as a 
“non-Afroasiatic” and “non-Bantu” belt²⁵ and the Kalahari Basin in the south (cf.
Güldemann 2010). The whole continent of Australia constitutes a particularly 
clear case of a cluster. It is, however, genealogically less diverse than the cluster 
in Africa (the diversity index is 1.9 in Australia versus 2.3 in Africa; note that the 
Australian cluster is listed as a meso-cluster in Table 3). A relatively weak cluster 
was identified in South America. 
Fig. 9: Significant macro-clusters for <‘mountain’, ‘stone’> in the ASJP data


It seems plausible to assume that the colexification of ‘mountain’ and ‘rock’ may 
be rooted, to a certain extent, in the physical environment of the speakers, at least 
in Australia, with rocky environments and/or arid regions. A common alternative 
colexification partner of ‘mountain’ is ‘forest’, specifically in regions with a rich 
vegetation (e.g., [Mexican] Spanish selva, originally ‘forest’ from Latin silva, is 
often used for ‘rain forest’ as well as ‘montane forest’; see Urban 2012). The African
cluster cannot, of course, be explained in this way. Tom Güldemann (p.c.)
has hypothesized that the colexification of ‘mountain’ and ‘stone’ shown in Figure 9
may represent a pattern that was widespread before the expansion of Afroasiatic
and Bantu languages, a ‘remnant areal pattern’ between the zones covered
by the two major families, as it were. We would obviously need other types
of evidence (especially historical) to test this hypothesis. In any case, the “belt”
of black squares in Figure 9 does not seem to correspond to an area that we could
define positively, for instance in terms of geographical characteristics or patterns
of language contact.


25 We owe this observation to Tom Güldemann.


4.4.3 <‘Ear’, ‘leaf’> 


<‘Ear’, ‘leaf’> is a particularly common pattern in Eastern Africa, most of the languages
being located in Güldemann's (forthc.) “Nilotic-Surmic spread zone”. As
Table 3 and Figure 10 show, there is a highly distinctive (meso-)cluster in this 
area, with 39 languages from 14 families. Our method has also identified some 
weaker and much smaller clusters in other parts of the world, i.e., in the Americas 
and in Australia, but they are far less diverse than the African cluster and it is 
questionable whether clusters with two members (as in Australia) should be 
taken into consideration at all (we included them because even two-member clus- 
ters may have resulted from language contact). 


Fig. 10: Significant meso-clusters for <‘ear’, ‘leaf’> in the ASJP data
Given the density of languages in the Nilotic-Surmic spread zone, it is hard to plot
the specific languages or language families of the <‘ear’, ‘leaf’> cluster in this 
area. Figure 11 shows the Glottolog codes of the families: Afro-Asiatic (afro1255),
Atlantic-Congo (atla1278), Central Sudanic (cent2225), Dizoid (dizo1235), Ta-Ne-Omotic
(gong1255), Heibanic (heib1242), Khoe-Kwadi (khoe1240), Lafofa
(lafo1243), Maban (maba1274), Blue Nile Mao (maoo1243), Nilotic (nilo1247), Nubian
(nubi1251), Songhay (song1307) and Surmic (surm1244).
Fig. 11: Macro-cluster <‘ear’, ‘leaf’> in Eastern Africa in the ASJP data (language families)


As we are not specialists in African languages, we cannot interpret the facts from
Eastern Africa any further. Specifically, it would be interesting to know what
other meanings are involved in this pattern. Moreover, it would be intriguing to
see if there is any connection between the physical environment and this colexification
pattern (e.g., insofar as there are plants with ear-like leaves). While this
may seem a bit far-fetched, a cursory glance at the materials available to us in fact
points in that direction. According to Hellenthal (2010: 493), in Sheko (shek1245)
(which is not in our data, but its relative Dizin [dizi1235] is), haay means both ‘ear’
and ‘leaf of ensete or yam’ (“ensete” is also known as “Ethiopian banana”) but
there are other words for other types of leaves. What ensete and yam leaves have
in common is that they are relatively large in comparison to their “hosts” and
prominently stick out laterally (while still being curled up, the “top leaf” of an
ensete is described with a different word in Sheko, i.e., mükmüri). This conjecture
gets some support from the fact that the pattern is also found in the desert regions
of Mexico and the Southern United States, where the agave (Agave americana) and similar
plants are widespread. We leave it to specialists, however, to explain the particularly
strong (areal) association of ‘ear’ and ‘leaf’ in this part of the world.²⁶


4.4.4 <‘Bark’, ‘skin’> 


Before we consider the colexification of ‘bark’ and ‘skin’ that emerged from our
analysis, it should be pointed out that ‘bark’ is not in the 40-item word list actually
used for ASJP (although it is in the 100-item Swadesh list). The number of data
points is therefore considerably lower than for most other pairs (cf. the case of
‘feather’ and ‘hair’ discussed in Sections 4.1 to 4.3). The two significant macro-cluster
areas are shown in Figure 12. The cluster in South America is the “strongest”
one.

What is remarkable about cluster area 1 is the genealogical heterogeneity of 
the languages exhibiting colexification of ‘bark’ and ‘skin’. As Table 3 shows, the 
cluster comprises fourteen languages from eleven families (note that some of the 
languages listed in the following are also mentioned by Urban 2012): Abipon 
(abip1241) is Guaicuruan; Masaka (aika1237) is Aikaná; Apinayé (apin1244) is Nu- 
clear-Macro-Je; Bororo (boro1282) is Bororoan; Cha'palaa (chac1249), Tsafiki 
(colo1256) and Guambiano (guam1248) are Barbacoan; Hixkaryána (hixk1239) is 
Cariban; Hupdé (hupd1244) is Nadahup; Minica Huitoto (mini1256) is Huitotoan;
Páez (paez1247) is Paez; Paraguayan Guarani (para1311) and Parakaná (para1312)
are Tupian; and Tacana (taca1256) is Pano-Tacanan.


26 Guillaume Segerer (p.c.) has pointed out to us that one interesting aspect of the <‘ear’, ‘leaf’>
colexification pattern is that, with a few exceptions, it seems to be absent from Niger-Congo. In
his data, it is attested in Sheko (shek1245), the Omotic languages of Wolaytta (wola1242), Gofa
(gofa1235), Dorze (dorz1235), Dawro (dawr1236), Haro (kach1284), Basketo (bask1236) [Ta-Ne-Omotic]
and Dime (dime1235) [South Omotic], and the Central Sudanic languages of Proto-SBB,
Modo (modo1248), Bongo (bong1285), Mangbetu (mang1394), Avokaya (avok1242), Kaliko
(kali1312), Logo (logo1259), Lugbara (lugb1240) and Ma'di (madi1260).

Figure 12 shows that there is also a cluster in Melanesia where ‘bark’ and 
‘skin’ are colexified. It comprises seven languages from two families (Austrone- 
sian and South Bougainville) and is therefore not as distinctive as the cluster in 
South America. Remember, however, that the number of data points is very small 
for this part of the world, as ‘bark’ is not included in the 40-item ASJP list. 


Fig. 12: Macro-clusters for <‘bark’, ‘skin’> in the ASJP data 


The colexification of ‘bark’ and ‘skin’ has also been discussed, but actually
discarded, as a potential lexical trait of Mesoamerican languages (cf. Smith-Stark
1994). As Figure 12 shows, there is no significant cluster in this area, according to
the ASJP data. This confirms Smith-Stark’s (1994) observation that the colexification
of ‘bark’ and ‘skin’ is not characteristic of that area, as the two concepts are also
colexified in neighboring, non-Mesoamerican languages.²⁷ However, our data suggests that, even
within the cluster, the degree of colexification is limited. Figure 13 shows the hypothesized
cluster area, identified by the hierarchical clustering process, and the
languages with and without colexification of ‘bark’ and ‘skin’.


27 For ways of quantifying membership to the Mesoamerican linguistic area, see van der Au- 
wera (1998b) and Gast (2007). 
Fig. 13: Colexification of ‘bark’ and ‘skin’ in Mesoamerica in the ASJP data


While Figure 13 suggests that colexification of ‘bark’ and ‘skin’ is actually much 
less widespread in Mesoamerica than we may have thought, it seems to us that 
what we find here in many cases is “loose colexification”. For example, in the 
Mixe-Zoquean language Copainalá Zoque (copa1236), naca means ‘skin’ and 
ku'yu-naca ‘bark’ or, literally, ‘tree-skin’ (Harrison, Harrison and Garcia H. 1981: 
280, 355). This example thus shows, once again, that a separate treatment of loose 
colexification will be beneficial to detect a broader range of colexification pat- 
terns. 
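
The contrast between strict and loose colexification in the Copainalá Zoque example can be made concrete with a minimal sketch. It is purely illustrative and is not the procedure used for the analyses reported here. We assume, for the sake of the example, that forms are segmented into morphs by a hyphen and that loose colexification means that one whole form recurs as a morph of the other; the function name colexification_type and the separator convention are our own.

```python
def colexification_type(form_a: str, form_b: str, sep: str = "-") -> str:
    """Classify a pair of word forms as 'strict', 'loose' or 'none'.

    Assumptions (for illustration only): forms are segmented into morphs
    with `sep`; 'loose' means that one whole form recurs as a morph of
    the other form.
    """
    a, b = form_a.strip().lower(), form_b.strip().lower()
    if a == b:
        # Identical forms for both meanings: strict colexification.
        return "strict"
    if a in b.split(sep) or b in a.split(sep):
        # One form is contained in the other as a morph: loose colexification.
        return "loose"
    return "none"


# Copainalá Zoque forms cited above: naca 'skin', ku'yu-naca 'bark'
print(colexification_type("naca", "ku'yu-naca"))  # loose
```

A real implementation would need proper morphological segmentation for each language; string matching of this kind only works where the source already marks morph boundaries.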


4.5 Clusters emerging from the CLICS data


Before we start to explore the data in a bottom-up way, it seems worthwhile to see 
if we can get any information from CLICS that would allow for a comparison with 
the ASJP data. As pointed out in Section 2.1, the types of concepts covered in 
CLICS are very different from those in ASJP and there is little overlap. One pair of 
elements, however, can be reasonably compared: CLICS provides information 
about ‘fire’ and ‘firewood’, which, in a way, complements the information on 
‘fire’ and ‘tree’ discussed in Section 4.4.1. Interestingly, the <‘fire’, ‘tree’> pattern 
is not found in CLICS at all. As can be seen in Figure 14 (showing hypothesized 
cluster areas), the data is very sparse, however, and it is not surprising that the
clusters did not pass the significance test in the regression analysis, even though
the colexification pattern shows a spatial autocorrelation according to the Join
Count test.
Fig. 14: Hypothesized cluster areas for <‘fire’, ‘firewood’> in the CLICS data


While the <‘fire’, ‘firewood’> clusters in Southeast Asia and Australia/New 
Guinea are in accordance with the ASJP data on the colexification of ‘tree’ and 
‘fire’, shown in Figure 8, it is perhaps surprising that the CLICS data shows some 
relevant data points in South America, as this part of the world does not exhibit 
a single instance of colexification of ‘fire’ and ‘tree’ in the ASJP data. CLICS lists
the following languages for this pattern: Qawasqar (qawa1238), Araona
(arao1248), Wayuu (wayu1243), Kaingang (kain1272), E'ñapa Woromaipu (also
known as Panare, enap1235) and Yavitero (yavi1244). To be sure, ‘firewood’ and
‘tree’ are different things but it still seems surprising that, according to the combined
ASJP and CLICS data, not a single South American language seems to colexify
both ‘fire’ with ‘firewood’ and ‘firewood’ with ‘tree’, a pattern that is attested
in 10% of the Sahul sample in Schapper, San Roque and Hendery (2016).
As we are not familiar with any of the South American languages involved, we 
have no further comments to make at this point. 

We can now turn to a data-driven inspection of the clusters emerging from 
the CLICS data. Our method identified 36 colexification patterns with a significant 
spatial autocorrelation and 37 cluster areas that were significant predictors for a 
given colexification (at a 0.05 level). There are twelve micro-clusters, thirteen 
meso-clusters and twelve macro-clusters. The colexification patterns are listed in
(2) and the thirteen significant meso-clusters are shown in Figure 15. Given the 
areal bias of the data, it comes as no surprise that most of the clusters are located 
in South America and Eurasia. 


(2) <‘air’, ‘weather’>; <‘beak’, ‘mouth’>; <‘believe’, ‘think’>; <‘body’, ‘flesh’>;
<‘brother’, ‘sister’>; <‘buy’, ‘take’>; <‘catch (ball)’, ‘take’>; <‘count’, ‘measure’>;
<‘count’, ‘think (be of the opinion)’>; <‘country’, ‘earth (ground,
soil)’>; <‘daughter in law (of a woman)’, ‘mother in law (of a woman)’>;
<‘daughter’, ‘son’>; <‘dig’, ‘drop (verb)’>; <‘dye’, ‘paint (noun)’>; <‘earth
(ground, soil)’, ‘world’>; <‘father in law (of a man)’, ‘son in law (of a man)’>;
<‘female (adjective)’, ‘woman’>; <‘female’, ‘woman’>; <‘furs’, ‘skin (hide)’>;
<‘get (obtain)’, ‘take’>; <‘grandson’, ‘nephew’>; <‘grass’, ‘pasture’>;
<‘grass’, ‘plant’>; <‘green’, ‘green (unripe)’>; <‘hold’, ‘keep (retain)’>;
<‘hold’, ‘seize (grasp)’>; <‘hold’, ‘take’>; <‘language’, ‘voice’>; <‘leg’,
‘thigh’>; <‘male (adjective)’, ‘man (vs. woman)’>; <‘male’, ‘man (vs.
woman)’>; <‘nephew’, ‘niece’>; <‘offspring (son or daughter)’, ‘son’>; <‘post
(pole)’, ‘tree’>; <‘release (let go)’, ‘send’>; <‘time’, ‘weather’>


Fig. 15: All meso-clusters emerging from the CLICS data 


Some of the clusters emerging from the data are not terribly interesting. For in- 
stance, there are many cases of adjectives and nouns being colexified, such as 
‘woman’ and ‘female’, which is a matter of morphosyntax rather than the lexicon. 
The top ten of the remaining significant clusters are shown in Table 4 (the data
for all clusters with a p-value lower than 0.1 is provided online).²⁸


Tab. 4: Significant lexical clusters from the CLICS data 


Colexification                        Size    No   Long     Lat     Radius   No lg   No fam   Div   pm     p.pm

<count, measure>                      macro   1    -68.0    -6.1    2,578    18      15       2.2   3.1    0.010
<catch (ball), take>                  meso    1    -67.4    -8.6    2,181    16      13       2.1   1.85   0.048
<count, measure>                      meso    1    -70.8    1.2     1,237    12      11       1.9   2.74   0.026
<hold, take>                          macro   1    -62.0    -30.9   2,754    9       8        1.6   2.17   0.034
<count, measure>                      micro   1    -75.4    -1.8    643      7       7        1.5   3.07   0.025
<air, weather>                        macro   1    44.5     41.3    3,603    22      5        1.2   2.73   0.042
<get (obtain), take>                  macro   1    48.3     46.3    3,088    24      5        1.2   2.38   0.039
<hold, seize (grasp)>                 micro   2    46.7     41.1    1,484    24      4        1.0   3.60   0.004
<count, think (be of the opinion)>    meso    1    46.2     41.9    1,557    18      4        1.0   2.98   0.011
<hold, seize (grasp)>                 meso    2    46.7     41.1    1,484    24      4        1.0   3.93   0.004


Table 4 can be split into two major groups of clusters, those from South America 
(the first five) and those from Eurasia (the last five). The five South American clus- 
ters comprise three colexification types: <‘count’, ‘measure’> (micro, meso and 
macro), <‘catch (ball)’, ‘take’> and <‘hold’, ‘take’>. The most prominent pair is 
clearly <‘count’, ‘measure’>. The macro-cluster listed at the top of Table 4 is 
shown in Figure 16. As the data in the table shows, it is extremely diverse genealogically
speaking, and is found in eighteen languages from fifteen families.


28 See http://www.uni-jena.de/~mu65qev/data/colex-tables/CLICS-clusters.htm (accessed 5 
April 2018). 
Fig. 16: Macro-clusters for <‘count’, ‘measure’> in the CLICS data


The following clusters are all located in Eurasia: <‘air’, ‘weather’> (macro), <‘get (obtain)’,
‘take’> (macro), <‘hold’, ‘seize (grasp)’> (micro and meso) and <‘count’,
‘think (be of the opinion)’> (meso and macro). The “strongest” macro-cluster for 
<‘air’, ‘weather’> is shown in Figure 17. As the map shows, this colexification is 
divided rather categorically between South America and Eurasia. 


Fig. 17: Macro-clusters for <‘air’, ‘weather’> in the CLICS data 
Further inspection of the data reveals some more interesting patterns, which we
do not have the space to discuss at this point. We will therefore restrict ourselves 
to some comments on one group of clusters discussed above. It is interesting to 
see that CLICS brings to light a number of verbal patterns. The lexical typology of 
verbal meanings is particularly difficult, because comparability is even harder to 
establish in this domain than it is for nominal meanings, and elicitation repre- 
sents an additional challenge. We have to reckon with elicitation artefacts, for 
instance, stemming from the language used during the interviews or the lan- 
guage in which the concepts are coded. For example, the fact that colexification 
of ‘count’ and ‘think (be of the opinion)’ (see Table 4) is widespread in Eurasia 
may be related to the fact that the concepts are colexified in Russian as well. Schitat’
is actually listed as an elicitation form for both concepts in the IDS data
(though, for ‘think’, an alternative form is given, i.e., dumat’). But then, it is pos- 
sible that Russian is just another language colexifying these concepts, perhaps 
under areal influence (note that the elicitation languages of the IDS data are not 
listed in CLICS). A more comprehensive study of such questions obviously re- 
quires more data from other parts of the world. 


5 Conclusions 


This study has been exploratory in nature. We intended to determine
how much information about areal patterns of colexification we can gain
from lexical databases such as CLICS and ASJP. We chose a bottom-up (rather
than hypothesis-driven) approach, identifying areal patterns in three steps: (i)
determine spatial autocorrelations in the data, (ii) identify clusters as candidates
for convergence areas and (iii) test the clusters resulting from the second step
while controlling for genealogical relatedness. Moreover, we determined a (genealogical)
diversity index for each cluster. For the ASJP data, we identified clusters associated
with four colexification pairs in this way: <‘fire’, ‘tree’>, <‘mountain’,
‘stone’>, <‘ear’, ‘leaf’> and <‘bark’, ‘skin’>. One of these patterns has figured
prominently in recent research carried out by specialists, i.e., <‘fire’, ‘tree’> (see 
Schapper, San Roque and Hendery 2016). Two of the colexification types have 
been discussed before, though not in very much detail, i.e., <‘mountain’, ‘stone’> 
and <‘bark’, ‘skin’> (see Urban 2012). The colexification of ‘ear’ and ‘leaf’, which 
seems to be prominent in Eastern Africa, has not been noted in the lexical-typo- 
logical literature, as far as we are aware (we are, of course, not familiar with the 
whole range of specialized literature). We regard these results as a proof of con- 
cept, in the sense that our bottom-up approach has yielded promising results. 
Inspection of the other patterns emerging from the ASJP data shows a number of
further interesting pairs, some of which may inspire more detailed research, such
as <‘ear’, ‘name’>, <‘horn’, ‘tooth’> and <‘horn’, ‘knee’> (see the more comprehensive
cluster lists in our online data repository, cf. footnote 24).
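
The first of the three steps summarized above, the detection of spatial autocorrelation, can be illustrated with a minimal sketch of a join-count statistic. It is a simplified illustration under our own assumptions and not the implementation used in this study: languages count as neighbors if they lie within a hypothetical great-circle threshold (max_km), the statistic counts neighbor pairs in which both languages show the colexification, and significance is approximated by random permutation of the values. All function names and the toy data are invented for the example.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lon1, lat1, lon2, lat2):
    """Haversine distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def neighbor_pairs(coords, max_km):
    """All index pairs (i, j), i < j, whose distance is at most max_km."""
    n = len(coords)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if great_circle_km(*coords[i], *coords[j]) <= max_km]


def join_count(pairs, flags):
    """Number of neighbor pairs in which both languages colexify."""
    return sum(1 for i, j in pairs if flags[i] and flags[j])


def join_count_test(coords, flags, max_km, n_perm=999, seed=1):
    """Observed join count and a one-sided permutation p-value."""
    pairs = neighbor_pairs(coords, max_km)
    observed = join_count(pairs, flags)
    rng = random.Random(seed)
    shuffled = list(flags)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between location and value
        if join_count(pairs, shuffled) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)


# Toy data (invented): five colexifying languages close together,
# fifteen non-colexifying languages spread far apart.
coords = [(0.1 * k, 0.0) for k in range(5)] + [(-170 + 20 * k, 40.0) for k in range(15)]
flags = [True] * 5 + [False] * 15
observed, p_value = join_count_test(coords, flags, max_km=500)
print(observed, p_value)  # observed is 10 (all pairs among the five neighbors)
```

A significant result of this kind only shows that colexifying languages lie closer together than expected by chance. It does not control for genealogical relatedness, which is why steps (ii) and (iii) are needed.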

The CLICS data shows a heavily biased areal distribution but it can be used 
to identify some differences between South America and Eurasia. Once interest- 
ing pairs of concepts have been identified, other regions of the world could be 
investigated more thoroughly. What makes the CLICS data particularly interest- 
ing is the inclusion of a broad range of concepts, including verbal ones. Given the 
relative scarcity of data and given the rather strict selection criteria that we ap- 
plied (e.g., a 0.05 level of significance), we only identified a relatively small num- 
ber of clusters. Still, some of these clusters are potentially interesting from an ar- 
eal point of view. Moreover, closer inspection of the data in the online repository 
(where all clusters with a p-value < 0.1 are listed, cf. footnote 28), again, brings to
light some further relevant patterns, such as clusters for various kinship terms 
(e.g., <‘daughter’, ‘son’>, <‘brother’, ‘sister’>, <‘grandson’, ‘nephew’>) and body 
parts (e.g., <‘leg’, ‘thigh’>), as well as for the particularly interesting case of <‘lan- 
guage’, ‘voice’>. The conceptualization of ‘language’ seems to vary greatly across 
the regions of the world, including metonymies such as ‘tongue’, ‘voice’ and 
‘word’ (see Radden 2004). The ways in which the (rather abstract) concept ‘language’
is encoded deserve a typological study of their own. Here as well, inspiration
for follow-up research was found in the data.

We have also pointed out some drawbacks of the use of major lexical databases.
First, the data has been collected from various sources, which means that
it is not based on consistent definitions, and most of the primary data was
probably elicited through English (or some other major language, such as Rus- 
sian), so that the elicitation stimulus functioned as a tertium comparationis. This 
implies the danger of elicitation artefacts of various types. As we saw in Section 
4.5, some languages of Eurasia, including Russian, colexify ‘count’ and ‘think (be 
of the opinion)’. It is possible that this is a genuine areal pattern but we cannot 
rule out that at least some data points were influenced by the (probably Russian) 
elicitation word. Note that the problems concerning the status of verbal meanings 
in cross-linguistic comparison are, of course, of a more general nature. Elicitation 
of verbal concepts represents a well-known problem for language documentation 
and linguistic typology, for at least two reasons. First, it is hard to know a priori 
what verbal concepts a language encodes, specifically when dealing with cultural
practices that most of us are unfamiliar with (such as hunting). Second, actions
are mostly harder to describe or paraphrase than nominal concepts. Multimodal
(or even behavioral) elicitation techniques are therefore required (e.g.,
videos). Elicitation is thus “expensive” and it is likely that, in general, lexicons
of little-described languages exhibit a “nominal bias” for this reason. It is
therefore a great asset that CLICS contains a considerable number of verbal concepts,
most of them originating from the IDS data, even though the data, obviously,
has to be handled with care.

Having started the experiment of detecting areal colexification patterns bot- 
tom-up in an entirely unprejudiced way, our conclusion is predominantly posi- 
tive. Even in our small-scale exploratory study we have identified various topics 
that deserve closer investigation. In spite of the inevitable noise in data gathered 
on a large scale and in collaborative efforts, we hope to have shown that lexical 
databases represent a valuable tool for typological research, even when used for 
purposes that they were not originally intended for. 
Martin Haspelmath
How comparative concepts and descriptive 
linguistic categories are different 


Abstract: This paper reasserts the fundamental conceptual distinction between 
language-particular categories of individual languages, defined within particular 
systems, and comparative concepts at the cross-linguistic level, defined in sub- 
stantive terms. The paper argues that comparative concepts are also widely used 
in other sciences and that they are always distinct from social categories, of 
which linguistic categories are special instances. Some linguists (especially in the 
generative tradition) assume that linguistic categories are natural kinds (like bi- 
ological species or chemical elements) and thus need not be defined but can be 
recognized by their symptoms, which may be different in different languages. I 
also note that category-like comparative concepts are sometimes very similar to 
categories and that different languages may sometimes be described in a unitary 
commensurable mode, thus blurring (but not questioning) the distinction. Fi- 
nally, I note that cross-linguistic claims must be interpreted as being about the 
phenomena of languages, not about the incommensurable systems of languages. 


Keywords: comparative concept, descriptive linguistic category, social category, 
natural kind, type-token relation, (non-)portable term, (in)commensurability 


1 Introduction 


To make lasting progress in linguistics, we need cumulative research results and 
replicability of each other’s claims. Cumulativity and replicability are not much 
emphasized by linguists and one of the reasons why these seem difficult to 
achieve is that, often, we cannot even agree what we mean by our technical 
terms. Typically, this is because we do not distinguish clearly enough between 
descriptive categories of individual languages and comparative concepts for 
cross-linguistic studies. We routinely use the same terms for both (e.g., ergative, 
relative clause, optative mood) but I have argued that we cannot equate the two
kinds of concepts in the general case (Haspelmath 2010). 

The first published critique of my 2010 proposal was van der Auwera and Sa- 
hoo (2015) but, in the meantime, several further articles discussing this method- 
ological distinction have appeared (especially the papers collected by Plank 2016 
and Lehmann 2016). I will use the opportunity of this paper to address a number 
of different points that have come up in the discussion of the issues over the last 
few years. 

Overall, I have few disagreements with those linguists that work in a broadly 
Boasian and/or Greenbergian tradition. But it is clear that some of my claims 
seem controversial, so I hope that this paper will clarify a few issues. (I do have 
real disagreements with linguists who simply assume a close match between cat- 
egories of particular languages and innate cross-linguistic categories; see Sec- 
tions 6 and 7.) 

In this paper, I provide further justification for the claim in (1) but, in addi- 
tion, I put special emphasis on the observation in (2) that the general category 
presumption is wrong for linguistics. 


(1) Ontological difference 
Comparative concepts are a different kind of entity than descriptive catego- 
ries (cf. Section 5). 


(2) General category fallacy 
We do not learn anything about particular languages merely by observing 
that category A in language 1 is similar to category B in language 2 or by 
putting both into the same general category C (cf. Section 6). 


For example, by saying that the Spanish-specific construction estar V-ndo ‘be V- 
ing’ is an instance of the general category “progressive”, we do not learn anything 
that goes beyond what we need to know for a description of this construction an- 
yway. Thus, general categories do not by themselves advance our knowledge, alt- 
hough there are, of course, many ways in which information about some other 
language or knowledge of cross-linguistic patterns can help describers to identify 
all the properties of a language-particular construction.¹


1 And, of course, in comparative contexts, statements such as “estar V-ndo is a progressive con- 
struction” are very useful. Linguists make comparative statements all the time but the point here 
is that they are different in nature from language-particular statements. 
This is worth emphasizing because there is a constant temptation to think
that subsuming a language-particular descriptive category under a general cate- 
gory does add information. We experience the usefulness of the general category 
presumption every day: when a young woman introduces a young man as her 
boyfriend, I can make certain further inferences concerning their behavior, which 
are usually very helpful for further interaction; and when I am told that a certain 
kind of infusion is real tea (made from Camellia sinensis), I have different expec- 
tations concerning its effects than if it is a herbal tea made of chamomile. It is 
important to understand why the general category presumption is a fallacy in 
comparative linguistics. 

Briefly, the answer is that the cross-linguistic comparative concepts (like pro- 
gressive) are not natural kinds or pre-established categories that exist inde- 
pendently of the comparison. Different languages represent historical accidents 
and (unless they influenced each other via language contact or derive from a 
common ancestor) the categories of one language have no causal connection to 
the categories of another language. By contrast, the categories “boyfriend” and 
“Camellia sinensis” do exist independently of particular circumstances. And if
someone becomes a boyfriend or if a new tea plant grows, this is causally con- 
nected to the independently existing category. 

I will elaborate on this point later on but, first, I discuss a number of different 
kinds of comparative concepts (Section 2). Subsequent sections will address a 
range of additional issues that have come up in the literature on comparative concepts
and descriptive categories.


2 Kinds of comparative concepts 


Comparative concepts can be divided into two main types: category-like compar- 
ative concepts and etic comparative concepts. With the latter type, there is no 
danger of confusing them with pre-established categories. 

Category-like comparative concepts are the most difficult to deal with but 
also the most familiar type of comparative concept. Some examples of category- 
like comparative concepts are given in Table 1, listed together with chapters from 
the World atlas of language structures (WALS) that make use of them. 
Tab. 1: Some category-like comparative concepts


Category-like comparative concept    WALS chapter

lateral consonant                    Maddieson (2005a)
syllable                             Maddieson (2005b)
reduplication                        Rubino (2005)
subject, object, verb                Dryer (2005)
independent personal pronoun         Siewierska (2005)
adnominal demonstrative              Diessel (2005)
future tense                         Dahl and Velupillai (2005)
applicative construction             Polinsky (2005)
epistemic possibility                van der Auwera and Ammann (2005)


All these terms were originally used for the description of some particular lan- 
guage and were extended to comparative use only later (they could therefore be 
called “descriptive-derived terms”). Some of them are phonetically based (e.g., 
lateral consonant) or semantically based (e.g., epistemic possibility). But most 
category-like comparative concepts which are familiar from typology are hybrid 
comparative concepts (Croft 2016: 3), i.e., they include both semantic-functional 
aspects and formal aspects in their definition. For example, a future tense form is 
a verb form which includes a marker that indicates future time reference of the 
situation denoted by the verb. Crucially, the form must include a grammatical 
marker, i.e., a formally defined entity,² and this marker must occur on a particular
class of roots (namely verb roots). In Haspelmath (2009: §6) and Haspelmath
(2010: §5), I listed and defined a dozen category-like comparative concepts, which
are all of this hybrid type. In these earlier papers, I focused on this subtype of 
comparative concepts, because these are the concepts that are often confused 
with descriptive categories. 

Another type of category-like comparative concept is known by terms that 
are not derived from grammars of particular languages. For the typology of argu- 
ment coding, the role types S, A, P, T and R, along with the notion of alignment, 
have proven very useful (Haspelmath 2011a) and, for the typology of subordination,
Cristofaro (2003) makes extensive use of the notions of balanced subordination
and deranked subordination. These concepts have been important in typology
but they are not normally used in descriptions and are therefore not easily
confused with descriptive categories. Similarly, the general concepts of locus
(head-marking and dependent-marking; Nichols 1992) and branching direction
(Dryer 1992) have been important in typology but need not play any role in particular
languages. The notions of adpossessive construction (Haspelmath 2017)
and existential construction (Creissels 2013) have also proven very useful, though
many grammatical descriptions make no use of these notions. They are still category-like
but less so than the descriptive-derived terms in Table 1. What is typical
of these concepts is that they are defined more narrowly than the corresponding
language-particular categories. For example, an adpossessive (i.e., an adnominal
possessive) construction is defined as a construction that expresses kinship relations,
part-whole relations and/or ownership relations (cf. Koptjevskaja-Tamm
2003) but, in individual languages, such constructions normally express other
relations as well (e.g., my chair ‘the chair I am sitting on’, your school ‘the school
that you are attending’).


2 A grammatical marker can be defined as a simple bound form (i.e., a form that cannot occur
in isolation) that occurs in close association with a major-class root (or in second position of the
clause) and expresses an abstract meaning which may correspond to nothing in a translation into
another language.

In addition to category-like comparative concepts, typologists also work with 
etic comparative concepts, which are kinds of pronunciations in phonetic typol- 
ogy and meanings or functions in grammatical typology, often of a type that 
would not be expected to be the meaning or function of a single form. In semantic 
map studies, for example (e.g., Haspelmath 2003; van der Auwera and Temürcü 
2006), the nodes on the map are meanings or functions (or uses) that are em- 
ployed by the typologist to express generalizations across languages, as illus- 
trated by Figure 1. 


3 Thus, I disagree with Lander and Arkadiev's (2016: 404) statement that “if comparative concepts
are not felt to be relevant for the grammars of different languages, they are usually not
viable”. On the contrary, many comparative concepts (e.g., all the etic ones) are not usable for
language description and, conversely, some of the well-known category-like concepts that are
not viable as comparative concepts (see example [8] in Section 8) work well in individual languages.


[Figure: semantic map whose nodes include participant-internal possibility, participant-external possibility, deontic possibility, epistemic possibility, condition, concession and complementation]

Fig. 1: Modality's semantic map (van der Auwera and Plungian 1998: 91)


Even though semantic map studies do not always make this fully clear, the mean- 
ings or functions (or uses) are not intended to correspond to any categories of 
languages. Categories of languages can be mapped onto semantic maps but there 
is no claim that the categories must be polysemous and that the meanings or uses 
on the map are somehow significant outside of the comparison. 

When the semantic-functional nodes on semantic maps are not abstract con- 
cepts as in Figure 1 but reflect concrete utterances, it is immediately clear that 
they are not linguistic categories but merely components of a comparative meth- 
odology. Examples of such token-based comparative concepts are visual stimuli, 
as employed in much recent research on semantic typology (e.g., Majid et al. 2007
on cutting and breaking events; Evans et al. 2011 on reciprocals), as well as translation
contexts, as employed by questionnaire-based studies (e.g., van der Auwera
1998a) and in parallel text typology (e.g., Wälchli and Cysouw 2012; Dahl
2014). Comparative concepts of the type considered in this paragraph are also
called “etic grids” (Levinson et al. 2003: 487), using a term originating in anthropology.⁴
The functions or uses of classical semantic maps of the type in Figure 1
have not been called “etic” but I would argue that their status is not any different.
As Croft (2016: 3) notes, the newer token-based methods “provide a denser distribution
of comparative concepts in particular regions of conceptual space” and
the existing cross-linguistic studies have shown that “linguistic categorization is
even more variable than we believed”.

What all comparative concepts share is that they are defined in substantive 
terms, i.e., making reference to aspects of form or meaning that are independent 
of the structures of particular languages. This allows them to be applied to all
languages in the same way, using the same criteria for all languages. This point
will become important in Section 7.


4 The terms “etic” and “emic” from American anthropology (going back to Kenneth Pike)
broadly correspond to the Hjelmslevian (European structuralist) terms “substance-based” and
“structure-based” (cf. Boye and Harder 2013).

Different kinds of comparative concepts relate to language-particular phe- 
nomena in somewhat different ways. Token-based comparative concepts must be 
matched by tokens of language use and category-like comparative concepts (like 
those in Table 1) are generally matched by categories of language systems. Cate- 
gory-like comparative concepts are particularly easy to confuse with descriptive 
categories because we talk about “a language having X” in both cases. As a language-particular
statement, for instance, we say that “German has a Future tense
construction, formed with the auxiliary werden” and, likewise, we say from a typological
perspective that “German has a periphrastic future tense construction”.
These two ways of expression sound almost identical but they are actually quite
different. From a comparative perspective, German could have a periphrastic future
tense construction that is at the same time an epistemic mood construction.
But German's Future tense construction cannot be anything else at the same time:
it is just a single language-particular construction, identified by language-particular
criteria.


3 Natural kinds, social categories and observer- 
made concepts 


Describing a new language is somewhat like discovering a new island that has 
not been visited by an explorer before. The language contains a large number of 
previously unseen elements of language structure: more concrete ones such as 
sounds and words and more abstract ones such as classes of sounds, meanings 
and sound-meaning combinations at multiple levels of organization. These can 
be compared to landscape features of the newly discovered island and to the 
plant and animal species inhabiting the island. The explorer will try to bring 
home pictures of the island's mountains and streams, as well as behavioral de- 
scriptions and specimens of the plants and animals, and, in modern times, she 
will also make videos that tell others about the new discoveries. Likewise, the 
descriptive linguist will make sound recordings of the language and bring home 
a dictionary and a grammar containing many new "linguistic species". 

When multiple islands are compared by comparative geographers and bioge- 
ographers, they must find a way of relating all the unique parts and life forms of 
the islands to each other. Now crucially, this is done differently for plants, animals
and minerals than for mountains and streams.

Plant and animal species, elements and kinds of minerals are natural kinds,
i.e., they are categories which “have properties that seem to be independent of
our minds” (Dahl 2016: 428). For example, the red fox (Vulpes vulpes) is a category
of animals that form a group regardless of any observers. To talk about them, we 
need detailed descriptions and agreement on a label but not a definition. If we 
know enough about red foxes, we can easily recognize them in California or China 
after having first described the species in Europe (or vice versa). The same is true 
for trees such as the sycamore (Acer pseudoplatanus), found in Spain, Belgium 
and Romania, and for elements and minerals such as gold or quartz.⁵ (Philosophers
seem to regard chemical elements as the best exemplars of natural kinds
but, for present purposes, biological species can also be included.) 

Mountains and streams, by contrast, are not categories of nature. They are 
concepts created by observers and we must learn what they mean from other peo- 
ple. If they are to be applied in science, they must be defined rigorously and de- 
limited from similar phenomena (e.g., mountains versus hills, streams versus riv- 
ers). They are comparative concepts of physical geography. Such delimitations 
are often somewhat arbitrary, so terminological uniformity among scholars may 
require decisions by nomenclature bodies (a well-known example is the Interna- 
tional Astronomical Union's 2006 decision to define the comparative concept of 
a planet in such a way that Pluto is no longer considered a planet). 

When exploring a new island, researchers may find completely new plants 
and animals (endemic to the island) but they will not find completely new land- 
scape forms to which existing terms (like “mountain” or “stream”) are inapplica- 
ble. Geographers may feel unhappy with conventional terminology and may pro- 
pose new ways of cutting up the continuum found in nature (just as astronomers 
changed their minds about planets). But such changes in observer-made con- 
cepts will not be triggered by any single discovery, the way a single new animal 
species requires a new name. 

But what about human cultures? Suppose the explorers encounter a new human
population, with different kinship patterns, poetic forms and house-building
styles than they are familiar with. How will these be categorized? On the one
hand, comparative culture scientists work with observer-made concepts. For example,
when Botero et al. (2014: 16784) find that “beliefs in moralizing high gods
are more likely in politically complex societies that recognize rights to movable
property”, they use the observer-made concepts “moralizing high god” and “politically
complex society”, which have a status very much like that of “mountain”
or “planet”. These are thus comparative concepts, not natural kinds.


5 Another sort of natural kind is represented by diseases such as tuberculosis, which can occur
in different places at different times and which can be cured in the same way, regardless of cultural
conventions (cf. Haspelmath 2015 on the analogy between linguistic categories and diseases).
Such diseases are generally caused by a single pathogen. (Of course, there are also disease
names that comprise rather heterogeneous conditions and these are then better seen as
comparative concepts, such as the “common cold”.)

On the other hand, human cultures and societies also have specific catego- 
ries that are neither natural kinds (in the sense that they recur across continents, 
independently of individual cultures) nor observer-made concepts but that are 
recognized by every member of the society. For example, Western societies have 
the categories “boyfriend” (a quasi-kinship concept), “poetry slam” (a poetic
form) and “office tower” (a house-building style). These are not universal and did
not exist in Western societies as recently as 150 years ago but, nowadays, they are
well-recognized parts of Western culture. I call such categories social categories.
What they share with natural kinds is that they are pre-established and there is a
causal connection between their members and the category. It is not only observers
of the Hong Kong skyline that put the buildings in the category “office tower”;
these buildings were created with precisely this category in mind. Similarly,
when a man becomes a woman's boyfriend, he knows in advance what social be- 
havior this category implies. 

Moving to language, many readers will readily agree that comparative concepts
used in language typology are observer-made in the same sense as “mountain”
or “politically complex society”. But what about the descriptive categories
that authors of grammars of individual languages set up for their descriptions? 
Are they not more like the unique plant and animal species that explorers used 
to find on newly discovered islands? And what about individual words or mor- 
phemes, such as the word bahi ‘book’ in Odia (an Indic language of India)? Here, 
I will argue that language-particular categories are social categories, not natural 
kinds or observer-made concepts (see Section 6). But before we get there, I will 
discuss the main challenges of language description and comparison (Section 4) 
and why there is no type-token relation between comparative concepts and de- 
scriptive categories (Section 5). 


4 The challenges of description and comparison 


Linguists often talk about “theoretical approaches” and “linguistic analysis” but
I do not find these notions sufficiently clear. It seems to me that all non-applied
linguistics is theoretical and that analysis is the same as description (Section 4.1).
Deeper questions often require comparison of languages (Section 4.2). 


4.1 Description 


Science begins with charting the territory and cataloguing the phenomena, as a 
prerequisite for comparing the data to answer deeper questions. A basic differ- 
ence between the two is that charting should be exhaustive while asking and an- 
swering deeper questions is an endless enterprise. 

In practice, it may be difficult to describe a language fully but this is a task 
that can in principle be completed. We do have very comprehensive dictionaries 
of quite a few languages and the complexity of grammars is not limitless either. 
Thus, one goal of linguistics is to describe all languages in such a way that every 
regularity is captured or, in other words, to chart the territory exhaustively. This 
is quite different from the comparison of languages, which is necessarily partial, 
as further discussed in Section 4.2. 

In addition to listing the words of a language, our descriptions need to make 
reference to categories (with names such as syllable, construction, inflection 
class, noun phrase and clause) because language use is productive and speakers 
can create and understand completely novel complex expressions. These catego- 
ries must strike a balance between elegance and comprehensibility. The more ab- 
stract the description, the less easy it will be to understand it, because it will pre- 
suppose understanding many abstract intermediate concepts. Thus, there is no 
such thing as the best description⁷ but descriptions can be more or less comprehensive
and, ideally, they would be exhaustive. Van der Auwera and Sahoo (2015: 2)
are right when they observe that not only comparative concepts but also descriptive
categories are “made by linguists” but the difference is that linguistic categories
must exist for productive language use to be possible, independently of
linguists. Different speakers may use different categories, just as different linguists
may prefer different categories, but categories of some kind must exist. (In
contrast, comparative concepts do not exist in the absence of comparative linguists.)


6 For example, Müller (2004) says that the Russian nominal inflectional suffix -o can be characterized
by the features {[+N],[+α,+β],[-obl]}. This is an elegant description because it requires
only four features. But it is very hard to understand because readers need to have an explanation
of the highly abstract features and their values first.

7 It is often said that descriptions should be cognitively realistic (reaching “descriptive adequacy”
in Chomsky's parlance) but it has never been made plausible that any existing descriptions
even approach this goal, so it is unclear to me how seriously it can be taken.

It is also sometimes said that descriptions should be “typologically informed”
(e.g., Himmelmann 2016) but it is unclear what exactly this means, beyond
the imperative to avoid idiosyncratic terminology. What is clear, however,
is that one cannot describe a language well by filling in a questionnaire or check- 
list. The grammars based on the Comrie and Smith (1977) questionnaire are often 
hard to understand because they do not give the authors the opportunity to in- 
troduce the basic categories that are crucial for understanding the grammatical 
patterns of the language. It is true that the checklist structure ensures compre- 
hensiveness and comparability but it does not ensure or even allow good descrip- 
tions. 


4.2 Comparison 


Unlike description of languages, comparison is not a goal in itself. It always 
serves some other goal, such as learning about human language in general or 
answering questions about the historical origin and development of languages. 

Comparison must be based on comparable phenomena, i.e., phenomena that are 
identified by the same criteria in all languages (sometimes called tertia compara- 
tionis). It is not sufficient if the phenomena happen to have the same label in dif- 
ferent languages. This is the same in other disciplines, such as geography. We 
can compare streets, bridges and subway lines across cities on the basis of their 
universally applicable formal and functional properties and probably also main 
streets and side streets, as well as one-way streets and city highways. But it makes 
no sense to compare streets called “Willy-Brandt-Straße” across German cities
(unless one's focus is on the history of street naming, of course). Thus, we can 
compare gender systems or causatives across languages only if we have a univer- 
sally applicable definition of the comparative concepts of gender and causatives. 
One of the most interesting results of comparison is implicational universals of 
the type pioneered by Greenberg (1963). In order to formulate testable universals
which can be replicated and can serve as the basis for a cumulative research
agenda, it is particularly important that the comparative concepts have clear
boundaries. Canonical definitions are useful in that they allow us to see how various
phenomena relate to each other conceptually (cf. Brown, Chumakina and
Corbett 2013) but they do not allow us to test universal (or other quantitative)
claims, because they do not have clear boundaries.⁹


8 Van der Auwera and Sahoo (2015: 139) say that each language should “be described in its own
terms, but that does not mean that one should start from ‘categorial’ or ‘conceptual’ scratch each
time one sets out to describe a new language”. But since each language has its own sets of conventions
and linguistic categories are defined within the language system (as will be seen in
Section 5), strictly speaking, one has to start from scratch, although, in practice, substantive
characterizations of categories will often serve as a good starting point for further detailed work
(see Section 8).

Unlike description, comparison cannot and need not be exhaustive. There 
are many things that can usefully be compared across languages but each lan- 
guage also has highly idiosyncratic features that cannot be readily compared. Ex- 
amples from grammar are stranded prepositions in English, strong and weak ad- 
jectives in German, liaison in French and A-not-A questions in Chinese. Linguists 
tend to study more general phenomena and they rarely wonder about idiosyncra- 
sies of lexical items and idiomatic multi-word expressions, of which every lan- 
guage has many thousands. All these can (and ultimately must) be described but 
they can hardly be compared across languages. This is not a problem, because 
there may not be anything special to learn about such historically accidental phe- 
nomena anyway, beyond their exhaustive description. 


5 Why there is no type-token relation between 
comparative concepts and descriptive 
categories 


According to Lehmann (2016) and Moravcsik (2016), comparative concepts can 
simply be seen as types of which descriptive categories are tokens: “Comparative
concepts are taxonomically superordinate to descriptive categories” (Moravcsik
2016: 422).

In some simple cases, this may seem to be the case. Thus, Moravcsik would 
say that English personal pronouns and Hungarian personal pronouns are tokens 
of the general category “personal pronoun” and Lehmann (2016: §2.3) says that
the Ancient Greek dual is a hyponym of the general (“interlingual”) category
“dual”. And in these particular cases, no big problems would arise.


9 The same is true of prototypical concepts (cf. Lehmann 2016: §2.2.2) or “vague” comparative
concepts (Lander and Arkadiev 2016: §3). With Dryer (2016: 317), I tend to think that the temptation
to set up concepts with non-clear boundaries in typology arises from the failure to distinguish
between comparative concepts and language-particular categories.

However, more generally, this is not the case, because descriptive categories are
defined in a very different way from comparative concepts: “Language-specific
categories are classes of words, morphemes, or larger grammatical units that are
defined distributionally, that is, by their occurrence in roles in constructions of
the language” (Croft 2016: 7).¹⁰ Comparative concepts, by contrast, are defined in
a way that is independent of distributions within particular systems. This is a cru- 
cial point that is often overlooked. 

For example, Moravcsik (2016: 420) says that one could ask whether the cat- 
egories of the Latin case system (Nominative, Accusative, etc.) hold for Warlpiri 
and that it is an empirical question whether the two are commensurable or not. 
And van der Auwera and Sahoo (2015: 3) say that three categories A, B, C from 
three different languages could simply be compared by checking whether they 
share the features a, b, c, d, and so on. But this approach cannot work, because 
categories are defined within particular systems, which are different across lan- 
guages. It makes no sense to ask whether Warlpiri has a Latin Accusative because 
the Latin Accusative is defined with respect to constructions of Latin. And when 
van der Auwera and Sahoo (2015) compare demonstratives of a special type in 
English, Dutch and Odia (such, zulk and emiti/semiti), they do not do so with re- 
spect to the defining features of these items but with respect to other comparative 
concepts which actually play no role in defining these items.¹¹

That comparative concepts are different kinds of entities than descriptive cat- 
egories is clearest in the case of etic comparative concepts, especially token- 
based concepts like visual stimuli and translation contexts. But category-like 
comparative concepts are not different in principle. The category-like comparative
concept “dative” (Haspelmath 2009: §6.1) is defined in the familiar substantive
way based on universally applicable semantic and formal features¹² but the
meaning of the English preposition to is defined with respect to the structural
network of constructional meanings in English. Many authors attribute a general
“goal” meaning to it and claim that a sentence such as Mary gave the money to
John uses the Caused-motion construction and thus has a slightly different meaning
than Mary gave John the money, which uses the Ditransitive construction (e.g.,
Goldberg 1992). From a comparative perspective, one can thus say that English to
matches the “dative” concept but one cannot say that it is a token of a general
(cross-linguistic) dative category or that it “instantiates” the general category.¹³


10 To this, I would add that phonemes and other phonological categories, as well as language-specific
meanings, have the same status.

11 In fact, there is no need to define English such, other than by its pronunciation, as van der
Auwera and Sahoo (2015: §3.7) note themselves.

12 A dative marker is a marker on a nominal that codes the recipient role if this is coded differently
from the theme role (Haspelmath 2009: §6.1).

That the difference is important can best be seen by controversial cases, such 
as the notion of subject, which has been widely discussed (also in Dryer's seminal 
1997 article). From a comparative perspective, it seems best to use the term “subject”
as the conjunction of the S argument (the single argument of a verb like
‘fall’) and the A argument (the agent argument of a verb like ‘kill’, cf. Dixon 1994: 
124) because, in this way, we can ensure the biggest overlap with the existing 
literature. However, in particular languages, definitions of syntactic roles are 
necessarily rather different. They do not make any reference to S, A and P but 
rather to constructions such as case-marking, person indexing and passivization. 
In Latin and German, for example, one could say that a Subject is a nominal ar- 
gument that is in the Nominative case and controls Verb Agreement. Subjects can 
have various kinds of semantic roles (going far beyond physical action verbs like 
‘kill’, which are the basis of the definition of A and P, as well as transitive clauses; 
Haspelmath 2011a), but these do not define the category. The category is defined
by case and agreement. 

The situation in English is different, because case is impoverished and vari- 
ous syntactic patterns are quite salient. For example, Subject-to-Object Raising 
not only allows patterns such as (3), but also patterns like (4), where the existen- 
tial particle there is raised. 


(3) a. The dog is in the house.
    b. I believe the dog to be in the house.


(4) a. There are two unicorns in the garden.
    b. I believe there to be two unicorns in the garden.


This is commonly taken to be a criterion for Subjecthood in English, for good reasons.
If we do not use the label “Subject” for the dog and there in (3) and (4), we
need to find some other label and none comes readily to mind. But this also
means that agreement is no longer relevant to the definition of Subject in English,
because the verb are in (4a) does not agree with there. In Icelandic, which has
much richer case marking, not even case is thought to be relevant for the definition
of Subject.


13 Dahl (2016: 429) objects to my earlier arguments against a type-token relation, observing correctly
that the mere fact that a category in a language has more properties than the comparative
concept does not mean that there can be no type-token relationship (see also Lehmann 2016:
§2.3). In Haspelmath (2010), I did not sufficiently emphasize that categories are defined distributionally
within a given language while comparative concepts are defined not distributionally but
by their substantive properties.

This well-known example nicely illustrates that, in different languages, dif- 
ferent criteria are used to identify categories that are rather similar semantically 
(because, of course, the Latin, English and Icelandic Subject categories are se- 
mantically similar and differ only in atypical cases). But since the categories are 
not defined by their meanings, their nature is different and they are incommen- 
surable. 

In such cases of incommensurable definitions, it is nonsensical to use the 
term "subject" as a general term and to ask, for example, whether the Subject is 
the controller of reflexivization in both Latin and Icelandic. There is no Subject 
concept that would work as a descriptive category in diverse languages. 

Thus, I maintain the view that comparative concepts and descriptive catego- 
ries are not the same kinds of things. But even more important is the point that 
we do not learn anything about language 1 by observing that its category A is 
similar to category B in language 2 or by putting both into the same general cate- 
gory C: the general category presumption does not work in cross-linguistic stud- 
ies. This is discussed next. 


6 Linguistic categories are not natural kinds but 
social categories 


When I realize that the Spanish noun nariz ‘nose’ belongs to the Feminine gender, 
this gives me additional knowledge about this noun: I can predict that it will oc- 
cur with the indefinite article form una (not un). And when you are told that the 
Russian verb kupit’ ‘buy’ is in the Perfective aspect, you can predict that its Non- 
Past form will have future time reference (ja kuplju ‘I will buy’). Thus, language- 
particular categories help predict the behavior of linguistic forms. In this regard, 
they are like natural kinds or (other) social categories. As we saw in Sections 1 
and 3, when told that something can be subsumed under a natural kind or a social 
category, we learn more: when told that a drink is made of Camellia sinensis, we 
can predict its health effects and, when told that a man is a woman's boyfriend, 
we can predict their behavior. Similarly, once we realize that an animal is a red 
fox (Vulpes vulpes), we can predict much about it and, if an investor is told that a
developer wants to build an office tower, they have clear expectations. Both natural
kinds (like tea, red fox and sycamore) and social categories (like boyfriend,
office tower and epic poem) are categories that exist in advance, independently 
of the categorization. Realizing that something is subsumed under a natural kind 
or social category is a finding that gives us additional information and we can 
establish a causal link between the phenomena and the categories. 

In this respect, natural kinds and social categories are crucially different from 
comparative concepts such as “mountain”, “planet” or “moralizing high god”. If
a geographer calls a landscape form on a newly discovered island a mountain,
this does not add any information and it does not establish a causal link. And the
classification by a category-like concept such as mountain may be regarded as 
too crude by other observers, to be replaced by more fine-grained comparative 
concepts such as precise contour lines on topographic maps (just as rough clas- 
sifications into alignment patterns based on S, A and P can be replaced by more 
fine-grained comparative concepts based on micro-roles; e.g., Hartmann, Haspelmath
and Cysouw 2014). Similarly, comparative concepts in economics such as
“developing country” and “industrialized country” are very crude and are usually
replaced by more fine-grained measurements.

But are categories of particular languages natural kinds or social categories? 
This depends on whether one sees language systems as biological entities or as 
conventional systems. 

In generative grammar, it is common practice to emphasize the biological 
foundations of language and it is often assumed that highly specific aspects of 
language are part of its biology, including not only architectural properties of the 
system but also substantive features (“substantive universals”).¹⁴ In this approach,
linguistic categories are thus regarded as natural kinds, which means
that the same categories are used in different languages, just as different lan- 
guages use the same architectural design for their rules. In other words, catego- 
ries are thought to be cross-linguistic categories (or universally available catego- 
ries; Newmeyer 2007). This means that there is no need to define linguistic 
categories, just as there is no need to define natural kinds such as red fox, gold 
or tuberculosis (Zwicky 1985: 284–286). Natural kinds can be recognized by various
symptoms, which need not be necessary and jointly sufficient, unlike definitional
criteria (cf. Haspelmath 2015).


14 "Substantive universals ... concern the vocabulary for the description of language; formal 
universals involve rather the character of the rules that appear in grammars and the ways in 
which they can be interconnected" (Chomsky 1965: 29).

I regard the generative vision as perfectly coherent¹⁵ but it has not been confirmed
by research on grammatical patterns over the last century. We have not
come up with a fixed list of categories (analogous to the periodic table of elements 
in chemistry; cf. Baker 2001) that we encounter again and again with exactly the 
same properties. 

In practice, when we describe a new language and find a phenomenon that 
is similar to a previously encountered phenomenon from some other language, 
this is far from the end of our study: we still need to look at the whole range of its 
properties. For example, when we discover a construction that has some proper- 
ties of a passive construction, we cannot simply say that it belongs to the natural 
kind “passive” and leave it at that. We need to investigate it in detail, until we 
have found all its properties in all contexts (e.g., Noonan 1994 on two different 
passives in Irish; Broadwell and Duncan 2002 on two passives in Kaqchikel). In 
the end, it does not matter what we call the newly found category — we should 
probably call it “Passive” for pedagogical reasons but, by attaching that label to 
the category, we have not learned anything that is not part of our primary description.
Thus, I do not see any reason to hope that we will ever find a fixed list
of possible categories and it remains a remote possibility at best.¹⁶

Languages have a strong biological basis but they vary widely across com- 
munities, i.e., they are systems of social conventions, like social hierarchies, re- 
ligions, laws, currencies and kinship systems. All of these consist of social cate- 
gories. In general, social categories are definable only within particular systems. 
Thus, the religious category “angel” can be defined only within a monotheistic 
religion of the Judeo-Christian-Islamic type; the kinship-like category “boyfriend”
can be defined only within a modern Western society; the currency Euro's
validity depends on the existence of European Union institutions; and so on. All 
social categories need to be described fully within their frame of reference and 
we do not learn anything new by linking them to a comparative concept. For ex- 
ample, if a religious scholar encounters an angel-like being in a newly studied 
faith, they cannot simply assume that it has all the properties of angels in Christianity
or Islam. And if a Western comparative legal scholar encounters a divorce
law in a non-Western society, they cannot simply assume that it has all the properties
of Western divorce laws (which are, of course, somewhat variable themselves).


15 Dryer (2016: 314) sees it in the same way: “The position that there are crosslinguistic categories
is, under such a view [i.e., of innate linguistic knowledge], at least coherent ... this is the only
coherent way in which there might be cross-linguistic categories.”

16 PHOIBLE (Moran, McCloy and Wright 2014) contains segment inventories of 1,672 languages
and it makes use of 2,160 comparative concepts for segment types. If more languages are added,
no doubt more and more segment types would have to be included. Many segment types recur
across languages but there is no reason to think that there is a biological limit on segment types.
The same is apparently true of other types of categories.

The three kinds of scientific concepts that I have discussed here and how they 
relate to concepts in other disciplines are summarized in Table 2. 


Tab. 2: Social categories, natural kinds and comparative concepts 


Discipline          Social category             Natural kind              Comparative concept
                    Independently existing      Independently existing    Observer-made
                    Culture-specific            Universally applicable    Universally applicable

linguistics         Spanish Feminine noun,      –                         ergative alignment,
                    Russian Perfective verb                               epistemic possibility
religious studies   Christian angel,            –                         moralizing high god
                    Jewish Rabbi
chemistry           –                           gold, quartz              catalyst
medicine            –                           tuberculosis              respiratory disease
biology             –                           Camellia sinensis,        predator, wing
                                                Vulpes vulpes
astronomy           –                           –                         planet
geography           office tower                –                         mountain, stream
sociology           boyfriend                   –                         father, mother, ego


Thus, linguistic categories are not independently existing natural kinds and 
there is no way around a complete description of phenomena of individual lan- 
guages. The question then arises what the status of category assignment contro- 
versies (Haspelmath 2007) is, i.e., for instance, why we would want to know 
whether Chamorro words with meanings like ‘big’ are “Class II words” (words
with weak pronoun subjects; Topping 1973) or whether they are “adjectives”
(Chung 2012). Both descriptions are possible, though the first one would seem to 
be more straightforward (as it makes reference to a highly salient feature whereas 
the second description builds on two fairly marginal phenomena). So why would 
one insist that a description in terms of adjectives is possible and desirable (as 
Chung 2012 does)? The only reason, it seems, is that it would confirm the hypoth- 
esis that all languages have nouns, verbs and adjectives as innate categories, i.e., 
that these are natural kinds. But this hypothesis seems to be based primarily on 
English and the alternative hypothesis that all languages are like Chamorro in 
having Class I and Class II words would also be confirmed by many (and maybe
all) languages (Haspelmath 2012).¹⁷ And if Chung's (2012) deeper study of
Chamorro had indeed made a discovery of broader significance, we would expect 
that other properties of the relevant Chamorro words would come to light due to 
their identification as adjectives. But this is not the case: the properties of 
Chamorro adjectives are specific properties of Chamorro, not general properties 
of adjectives in all languages. Calling them adjectives does not teach us anything 
further about Chamorro (or about human language) and thinking that it does 
means succumbing to the general category fallacy in (2). 


7 Different criteria for different languages 


Unfortunately, the general category fallacy is still widespread in linguistics. 
When there is a prominent grammatical term, linguists often assume that it 
stands for a general category that exists independently of the term and of partic- 
ular languages. Since languages differ in the criteria that can be used, linguists 
resort to different criteria for different languages. It is often implicitly assumed 
that this is an acceptable strategy and, sometimes, it is also stated explicitly, as 
in (5). 


(5) a. adjective
       Dixon (2004: 9): “All languages have a distinguishable adjective class ...
       [which] differs from noun and verb classes in varying ways in different
       languages, which can make it a more difficult class to recognize.”

    b. word
       Spencer (2006: 129): “There may be clear criteria for wordhood in individual
       languages, but we have no clear-cut set of criteria that can be applied
       to the totality of the world's languages.”

    c. monoclausal pattern
       Butt (2010: 57): “Whether a given structure is monoclausal or not can
       only be determined on the basis of language-dependent tests. That is to
       say, tests for monoclausality may vary across languages, depending on
       the internal structure and organisation of the language in question.”

    d. noun phrase versus prepositional phrase
       Baker (2015: 13): “[To distinguish NPs and PPs, we should] hope that one
       can find some fine-grained syntactic properties which distinguish the
       two kinds ... : a process of clefting, perhaps, or quantifier floating – the
       sorts of syntactic phenomena known to apply to NPs but not to PPs in
       some languages.”


17 Of course, such a hypothesis could only be formulated after turning Class I and Class II into
comparative concepts (or by assuming that they are innate categories of UG), just as the
Latin-specific category Adjective has been turned into a comparative concept.

However, using different criteria (or “tests”, “properties” or “diagnostics”) for
different languages makes sense only if we have good reason to think that the 
phenomenon exists as a universal category (or natural kind) in the first place. In 
generative linguistics, the presupposition that part of our grammatical 
knowledge is innate makes it at least a coherent enterprise to look for such uni- 
versal categories but, if there are no good initial reasons to think that categories 
like “word” or “prepositional phrase” are universal (other than that they have 
been used in the grammatical tradition of the last few decades and centuries), it 
is not a promising enterprise. Croft (2009, 2010) has called this approach “methodological
opportunism”. Another term that I have used informally is “diagnostic-fishing”.

It seems to me that diagnostic-fishing is one of the biggest obstacles to rigor- 
ous cross-linguistic comparison and to the sort of replicable and cumulative sci- 
ence of language structures that I mentioned at the beginning of this paper. It is 
for this reason that I regard the distinction between language-specific descriptive 
categories and rigorously defined comparative concepts as fundamental for the 
progress of typological linguistics. 


8 Portable terms for category-like comparative concepts


Some category-like comparative concepts seem very similar to corresponding de- 
scriptive categories. For example, the Italian Future tense and the Swahili Future 
tense are similar to each other (in the sense that their language-particular de- 
scriptions would involve very similar basic notions) and one could say not only 
that they correspond to the comparative concept “future tense” of Dahl and Velupillai
(2005) but even that “the Italian Future tense is a future tense”, i.e., that
there is a type-token relationship here or an instantiation relationship. And for
languages which have two such categories, like English, one could say that “both
the will Future and the gonna Future instantiate the future tense”. Thus, for these
concepts, it is possible to see the comparative concepts as categories or classes.
The comparative concept “future tense” would then be the class (or category) of
all tense forms in different languages that fulfill the definition. 

Terms for comparative concepts of this kind are called “portable” by Beck 
(2016) and there are quite a few of them, such as those in (6). 


(6) personal pronoun, second person, demonstrative, polar question, accusa- 
tive, instrumental, comitative, future tense, past tense, dual, plural, cardi- 
nal numeral, conditional clause, bilabial, velar, fricative, nasal stop 


I do not agree with Beck (2016: 395) that these are language-particular terms 
which “are comparative concepts”¹⁸ but, clearly, these terms are widely used for
category-like comparative concepts which do not differ greatly in their definition 
from the corresponding descriptive categories. In many or most circumstances, it 
does not matter much for these concepts whether they are defined substantively 
like comparative concepts or distributionally like language-particular categories. 
It seems that those linguists who deny or ignore the importance of the distinction 
between comparative concepts and descriptive categories mostly have this sub- 
set of comparative concepts in mind. 

However, even here, it is often necessary to distinguish between descriptive 
categories and comparative concepts when one considers the phenomena in 
greater detail. For example, the German polite pronoun Sie ‘you’ is semantically 
a second person pronoun but, within the grammar of German, it is a Third Person 
form that triggers Third Person indexing on the verb (e.g., -en in Sie kommen ‘you
are coming’). The English polite question would you please open the door? is a
Polar Question within the grammar of English (as can be seen from its word
order and intonation pattern) but, functionally, as a speech act, it is not a ques- 
tion but a request. The Finnish Present tense is normally used in future contexts 
where English requires a special future tense form (Dahl and Velupillai 2005) but 
it would still be strange to say that “the Finnish Present tense instantiates the 
future tense”.¹⁹


18 But perhaps Beck (2016) means this statement as a description of the historical process, in 
which case I agree. Clearly, these terms originated as descriptions of language-particular cate- 
gories which were transferred to other similar languages without much confusion arising (as 
noted in Section 2). The resulting comparative concepts are different (see below) but the differ- 
ence is not striking and may not be noticed much in practice. 

19 Lehmann (2016: §2.1) says that grammatical category concepts can be multiple hyponyms of
other grammatical category concepts but it seems that this is possible only when these are on
different levels (as with his example of adverbial clauses, which instantiate both “subordinate
clause” and “adverbial modifier”). It hardly seems felicitous to say that the Finnish Present tense
is both a present tense and a future tense or that the Turkish Dative case is both a dative case
and an allative case. For this reason, I have used the verbs “correspond to” and “match” for the
relation between descriptive categories and comparative concepts rather than “be” or “instantiate”.

How does one distinguish between portable and non-portable category labels?
I do not know any simple answer to this question. Most grammatical category
terms from the Greco-Latin tradition have been used for other languages but
not all of them have given rise to general concepts that can be defined in the same
way (using substantive concepts) for all languages. Some concepts that do not
seem to work for all languages are listed in (7).


(7) a. aorist, supine, gerund, middle voice, ablative absolute
    b. word, clitic, adposition, compound, incorporation, morphology
    c. inflection, derivation
    d. finite, converb²⁰


The terms in (7a) belong to the more exotic aspects of the classical languages and
only “middle voice” has been used in a typological context, as far as I am aware
(but while Kemmer [1993] cites many similarities in different languages, she does
not provide a definition of middle voice with clear boundaries). The unsolved
problems with “word” and “clitic” as comparative concepts are discussed in
Haspelmath (2011b, 2015) and they carry over to other concepts defined in terms
of “word”, such as adposition, compound and morphology. Sharp boundaries
between inflection and derivation are often assumed (e.g., when gender is defined
in terms of a lexeme concept, which is itself defined in terms of the inflection
concept) but they do not seem to be definable in a cross-linguistically applicable
way (cf. Plank 1994). Finally, finiteness is not a useful concept cross-linguistically,
because it combines both person marking and tense marking,
which need not be absent or present together (cf. Cristofaro 2007).

20 The term “converb” is defined in terms of the finiteness concept in Haspelmath (1995) and 
thus inherits its unsolved problems (see also van der Auwera 1998b on the definition of
“converb”).


9 Commensurable description of different languages


Moravcsik (2016: 421) asks whether descriptive categories are different for all lan- 
guages, even closely related languages such as French and Italian. And what 
about dialects or historical stages of a language? “Are relative clauses of Standard 
Modern English categorically different from those of the African-American Vernacular
and also from those of Middle English?” (Moravcsik 2016: 421). And Dahl
(2016: 430) asks a similar question: “If we accept that a category varies within
one language, why can't it do so across languages?”

The answer is that it depends on how we view and describe these languages, 
as different systems or as variants of a single system. Especially for closely related 
languages, describing them as variants of a single system makes good sense for 
practical purposes. This is what Gil (2016) calls the “unitary commensurable
mode” of description. Adopting this mode means that the same categories are
used and variation is described in an ad hoc way. Thus, for example, we could 
describe German and Modern English relativizers in the same way, as Relative 
Pronouns, regardless of their synchronic status within the system. We would then 
say that Modern English that is a relative pronoun (cf. van der Auwera 1985), like 
the German relative pronouns, and that it just happens to be case-invariant and 
identical to the complementizer that.²¹

One could extend the unitary commensurable mode to languages even fur- 
ther away and this is, of course, what has traditionally been done, for instance, 
when linguists have said that the accusative in Swahili is expressed by word or- 
der or the vocative in English is identical to the nominative. Such descriptions are 
now universally thought to be cumbersome and ethnocentric and linguists agree 
that they do not do justice to the languages whose structure is not Latin-like. But 
such judgements are always somewhat subjective and I do not know how to 
achieve greater objectiveness in language description. As I noted in Section 4.1, 
description must primarily be comprehensive and it must include categories
which strike a balance between elegance and comprehensibility. Uncontroversially,
using the same categories for all languages leads to hopelessly inelegant
descriptions,²² so the issue of incommensurability arises whenever different
language-specific categories are set up by researchers. Since the well-known European
languages English, Spanish, French, German and so on are very similar in
their structure, incommensurability does not raise its head very often and many
linguists blissfully ignore it.


21 Another situation where two categories may be known by the same label is when they are
cognate but not particularly similar anymore. For example, the Modern German Subjunctive
mood has almost no functional overlap with the English Subjunctive (as in I insist that he come)
but both are known by this name because they derive from the same Proto-Germanic form. The
term “subjunctive” is not used as a comparative concept here but as a label for a cognate set, like
“the *tūn word”, a possible label for the cognate set comprising both English town and German
Zaun ‘fence’, which derive from Proto-Germanic *tūn. Cognate sets are united by common origin,
not by any common features.

But when it does arise, as with the question whether Serbo-Croatian adnominal
demonstratives are adjectives or determiners (cf. Bošković 2009), one needs
to be aware that terms like “adjective” and “determiner” are either defined
language-internally (in which case Bošković's question is a terminological question)
or as comparative concepts (in which case Serbo-Croatian adnominal demonstra- 
tives would normally be treated as determiners, not as adjectives, because the 
latter are defined semantically, with respect to properties such as age, dimension, 
value and color). 


10 Universal claims pertain not to language structures, but to language phenomena


Dahl (2016: 432) notes that “generalizations presuppose the possibility of making 
statements about individual cases”. Thus, corresponding to the universal in (8a), 
there must be a true language-particular statement as in (8b) and similar state- 
ments for all languages that have question-word movement. 


(8) a. Question-word movement is always to the left. (Haspelmath 2010: 671)
    b. In Swedish, question-word movement is to the left.


Dahl (2016: 432) correctly observes that “if typological generalizations do not in- 
volve language-specific categories, these statements should also be free from 
such categories”. This may sound paradoxical, because (8b) would seem to be a 
statement about Swedish grammar and the rules of Swedish grammar are sup- 
posed to be stated in terms of language-particular descriptive categories. 


22 More precisely, this is uncontroversial outside of generative linguistics. In generative linguis- 
tics, not even the goal of comprehensive description (Section 4.1) seems to be shared, let alone 
the goal of readily comprehensible description. 




The paradox is resolved by noting that (8b) is a correct factual statement 
about the Swedish language but is not a rule of the Swedish language. The corre- 
sponding Swedish rule says that Question Words are moved to the Prefield Posi- 
tion (i.e., the position preceding the Finite Verb) and this rule is, of course, for- 
mulated in structural terms that presuppose other descriptive categories of 
Swedish.²³ The relationship between the Swedish rule and the factual statement
in (8b) is that the rule makes it straightforwardly clear that the factual statement 
is true, i.e., there is a matching or correspondence relationship (but, of course, 
not an instantiation relationship). 

Very similarly, the universal in (9a) entails a statement such as (9b). 


(9) a. In almost all languages, the subject normally precedes the object when
       both are nominals. (Greenberg 1963, Universal 1)
    b. In Mandarin Chinese, the subject normally precedes the object.


LaPolla (2016: §2) objects to the claim that Chinese is an SVO language – which
is a more specific claim than (9b) but otherwise very similar – because he has
shown in earlier work that Chinese does not have any subject or object category.
LaPolla (2016: 370) thinks that “labeling [Chinese as an SVO language] implies
that these categories either determine word order or are determined by it” (cf.
LaPolla and Poa 2006). But again, this is not so. (9b) is a correct factual statement
about Mandarin Chinese (assuming that “subject” means S/A, and “object”
means P) and it is not a rule of Mandarin grammar.²⁴ LaPolla (2016: 370) may be
right that “most people who see a description of Chinese as SVO will in fact assume
that the label was given to the language because those categories are significant
for determining word order in the language”. But if they do, they have
not understood the difference between describing a language and classifying a
language from a comparative perspective. These two are different enterprises –
not completely unrelated, because both are based on the phenomena of the language,
but also not identical.


23 A generativist might try to formulate both the universal in (8b) and the Swedish rule in terms
of a cross-linguistic category (a natural kind, part of innate linguistic knowledge) such as “spec- 
ifier of C position". Such a view has indeed been popular (and may still be held by many) but 
there are very few cross-linguistic phenomena that support it. In the great majority of cases, 
question words are simply fronted, without any evidence for a C position (cf. Dryer 2005). 

24 Confusingly, LaPolla (2016) uses the expression “the facts of the language” in the sense in 
which I use “rules of the language” (this strange terminology may be motivated by his rejection
of structuralism and the competence/performance distinction).

The notion of “factual statement” may be a bit surprising to some readers,
because it seems not to have played an important role in typology so far. But I 
would argue that, implicitly, it has long been there. As part of their grammar- 
mining activities, typologists have generally considered the entire description of 
a language, not merely the part where the author describes a particular category. 
In many cases, considering the frequency of occurrence of a particular form or 
function is part of this. For example, Dobrushina, van der Auwera and Goussev 
(2005) say that they regard an inflectional form with subjunctive functions as an 
optative if “the expression of the wish is the main function”, which is presumably 
decided by frequency of use. Similarly, Dryer (2005) distinguishes between dom- 
inant order and lack of dominant order on the basis of frequency of use. 

Thus, what we compare across languages is not the grammars (which are in- 
commensurable) but the languages at the level at which we encounter them, 
namely in the way speakers use them. This is true not only for word order but also 
for cross-linguistic variation in semantic categorization. Studies based on etic 
comparative concepts such as translation questionnaires, visual stimuli and par- 
allel texts lead to groupings of comparative concepts into larger clusters and to 
semantic maps as seen in Figure 1. These etic concepts typically reflect uses to 
which the categories can be put, not different meanings, and they would not play 
a role in their semantic description. 

This is again similar to what is practiced in related disciplines: When anthro- 
pologists compare kinship terms, when political scientists compare political sys- 
tems and when economists compare economic activities, they must make refer- 
ence to what happens on the ground rather than to the incommensurable 
categories of the diverse cultures.²⁵ For linguistics, the relative independence of
typology from description was already noted in Haspelmath (2004). 


11 Conclusion 


I conclude that there is a fundamental distinction between language-particular 
categories of languages (which descriptive linguists must describe by descriptive 
categories of their descriptions) and comparative concepts (which comparative
linguists may use to compare languages). Language-particular categories are defined
system-internally, by other language-particular categories, but comparative
concepts are defined substantively, by other comparative concepts. The distinction
between system-internal categories and comparative concepts is found
in the same way in other disciplines dealing with social and cultural systems and
has been well-known in anthropology by the labels “emic” (for system-internal
categories) and “etic” (for comparative concepts). I have also compared linguistic
categories with natural kinds, as familiar from biology and chemistry, and I have
argued that they are not natural kinds, because they do not recur across languages
with identical properties. Thus, it is not licit to use different criteria or
symptoms for the identification of the same categories across languages.


25 These disciplines can make mistakes as well, of course. For example, comparative economists
can make the mistake of equating economic activities with legally recorded activities expressed
in money values, ignoring subsistence and “shadow” economies of various sorts. Such
a failure may lead to a very distorted view of economic patterns.

The widespread confusion between language-particular categories and cate- 
gory-like comparative concepts seems to derive from the fact that, for a signifi- 
cant part of the categories (“portable categories”), a characterization in substan- 
tive terms gets us fairly far (e.g., characterizing nouns in terms of things, persons 
and places). As a result, carrying over terms from one language to another lan- 
guage based on substantive similarities is often possible, sometimes without any 
serious difficulties. But it is universally recognized that, ultimately, linguistic cat- 
egories must be defined in structural terms (with respect to other constructions 
of the language), so the distinction does not disappear. 

Finally, I noted that, on the present view of comparative linguistics, what we 
compare is not language systems (which are incommensurable) but “the phenomena
of languages”.


1 Introduction


In recent years, a number of studies, such as Beyer (2009), Dryer (2009) and 
Devos and van der Auwera (2013), have drawn attention to some typologically 
striking properties of negation marking in languages of different parts of Africa. 
Dryer (2009) focuses on “neutral negation”, i.e., obligatory and productive (gen- 
eral) negation marking patterns in declarative verbal main clauses expressed by 
negation markers that are words, in languages with SVO order in Africa. He 
demonstrates that SVO languages in “an area in central Africa [stretching] from 
Nigeria across to the Central African Republic and down into the northern Democratic
Republic of Congo”, as illustrated in Figure 1,¹ significantly differ from
SVO languages elsewhere in the world in that “the negative [word] follows the 
verb [instead of preceding it], typically occurring at the end of the clause, in 
SVONeg order” (Dryer 2009: 307). Dryer (2009) also points out that double negation
marking is widespread in this region.




Fig. 1: VO&VNeg languages in Africa, with their core area delineated (Dryer 2009: 323) 


Beyer (2009), in the same volume on negation patterns in West African languages 
(Cyffer, Ebermann and Ziegelmeyer 2009) as Dryer (2009), focuses specifically on
double negation marking for “sentential negation” in a large group of West African
languages centered on the Volta River basin, as illustrated in Figure 2. In
most cases, the second of the two negation markers also happens to be clause-final.


1 The (red) line cutting off the southern-most point on the map has been added by me, as this
point must be a mistake. It is labeled as Ngbaka (Ubangian) on other maps in Dryer (2009) but it
has to be some Bantu language. Yet, no Bantu language from this area is mentioned in the paper.


[Map titled “Double Negation Marking in Gur, Mande, Kwa, and Kru Languages”]


Fig. 2: Double negation marking in West African languages centered on the Volta River basin 
(Beyer 2009: 222) 


Devos and van der Auwera (2013) is an in-depth study of multiple negation exponence
in Bantu languages, a large group of languages spoken in a vast area from
Cameroon and Kenya in the north all the way down to South Africa.
They survey cases of multiple negation in Bantu languages, which is usually double,
although some exuberant examples of triple and quadruple negation marking are
also attested. They also investigate recurrent sources for post-verbal negation
markers. Many of these post-verbal negation markers also happen to be clause-final,
as illustrated in Figure 3.²


2 In the Bantuist tradition, the term “post-final” used in Figure 3 refers to the position immedi- 
ately following the verb. 


Fig. 3: Bantu double negation (Devos and van der Auwera 2013: 215) 


As these studies make clear, clause-final negation markers (CFNMs), although 
typologically rare, can be found in a very wide range of languages of Sub-Saharan 
Africa. Based on a sample of 618 African languages, I demonstrate in this paper 
that the spatial distribution of languages with CFNMs forms a clear areal pattern 
within Sub-Saharan Africa. At the same time, the spatial distribution of 462 lan- 
guages with post-verbal negation markers of any kind does not form any distinc- 
tive areal pattern, as it is virtually identical to the spatial distribution of all the 
languages of the sample as a whole. The two distributions overlaid with their spatial
intensity plots are shown in Figure 4 and Figure 5 respectively.³ I am not able
to plot the spatial distribution of multiple negation exponence on the scale of the
continent at this point but I expect it to have a much less pronounced spatial
structure than that of CFNMs.


3 All the plots and calculations for this paper have been produced with the software R (R Core
Team 2015). The plots of spatial intensity and spatial interpolation have been produced with the
package spatstat (Baddeley and Turner 2005).




Fig. 4: Geographic distribution of the 462 languages of the sample with post-verbal negation 
markers and a plot of their spatial intensity 


Fig. 5: Geographic distribution of the 618 sample languages and a plot of their spatial intensity 
Besides forming a clear areal pattern within Sub-Saharan Africa, on a worldwide
scale CFNMs are also typologically much more unusual than post-verbal negation
markers and multiple negation exponence. Furthermore, as I have argued elsewhere
(Idiatov 2012a), CFNMs in Sub-Saharan Africa tend to be characterized by
a number of peculiarities in their morphosyntax and diachronic development 
that set them apart from similar markers elsewhere in the world and offer im- 
portant clues as to an explanation of their observed areal distribution. Of course, 
some of these differences are more a matter of degree, yet some do seem to be 
more fundamental. For instance, CFNMs in African languages are often associ- 
ated with the presence of multiple negation exponence within a clause, most 
commonly double but sometimes also triple and occasionally quadruple. CFNMs 
in Africa often happen to be morphosyntactically deficient as compared to more 
canonical grammatical markers in being optional or lacking in some types of 
clauses as conditioned by the TAM value of the predicate of the clause, the sub- 
ordination status of the clause, the associated information structural and speech 
act type values or the discourse type that the clause belongs to (cf. Idiatov 2015). 
Diachronically, CFNMs in the area tend to be rather unstable and appear to be 
relatively easily borrowable (cf. Idiatov 2012b; 2015), unlike negators in other 
parts of the world but more like discourse markers, focus particles and phasal 
adverbs (cf. Matras 2009). 

All this makes CFNMs in Sub-Saharan Africa a particularly interesting mor- 
phosyntactic feature to explore from the perspective of language dynamics in 
space and time. This paper provides such a spatio-temporal analysis of the feature
CFNM in African languages. For reasons of space, I do not elaborate in the
rest of the paper on the explanation of why many negation markers are clause-final
in Sub-Saharan Africa and why such negation markers are so common in this region
in particular. I treat these issues in more detail elsewhere (Idiatov 2012a; in
prep.). Here I give only the gist of the explanation. The
clause-final position of the negation markers is explained by their origin in other 
clause-final markers. The fact that CFNMs are so common in this region is related 
to another typological feature of many of the relevant languages, viz., a grammat- 
ical category of clause-final markers whose core function is the expression of in- 
tersubjective meanings. Combined with the fact that negation is exactly one of 
those situations, propitious for the use of intersubjective markers, where the 
speaker's assertive authority is at stake, frequency effects account naturally for 
the tendency to conventionalize clause-final negation markers. 

The paper is organized as follows. I begin by discussing in Section 2 various
aspects of the definition of CFNMs adopted for this typology. This definition is
rather inclusive because, as I explain in Section 2.1, my goal is to capture as much
of the synchronic diversity as possible in order to achieve better explanatory adequacy. However,
for the reasons explained in Section 2.2, I leave negation constructions with nom- 
inal predicates out. Since I consider both obligatory and optional CFNMs, in Sec- 
tion 2.3, I address some of the issues that the distinction between the two may 
present. In Section 2.4, I elaborate on the meaning of the term “clause-final” 
within this typology. Related to the latter point is the issue of relevance of the 
relative order of object and verb for a typology of CFNMs. As I explain in Section 
2.5, this order is not relevant in the typology presented here, unlike for example 
in the typology of post-verbal negation markers by Dryer (2009), which is con- 
fined to languages with VO order. I briefly present my sample in Section 3. I also 
provide maps of the 618 languages of the sample as a whole and of the languages 
with and without CFNMs. The simple binary division into languages that have 
and languages that do not have CFNMs hides important diversity among the lan- 
guages with CFNMs. As I discuss in Section 4, in order to better capture this di- 
versity and thus get a better idea of the possible historical and spatial dynamics 
behind the observed pattern, I increase the degree of granularity of my data by 
taking into account two parameters, viz., obligatoriness of CFNMs and possible 
restrictions on the freedom to use CFNMs in different constructions. Section 5 pro- 
vides a discussion of the spatial and temporal dynamics reflected in the observed 
areal typological patterns of CFNMs in Africa. I first discuss in Section 5.1 the re- 
sults and potential pitfalls of two methods of spatial analysis and visualization of 
the distribution of the values of the feature CFNM in Africa, viz., spatial interpo- 
lation and generalized additive modeling (GAM). Both methods converge on the 
need to distinguish two focal areas of the feature CFNM. The first one, the Central 
Focal Area (CFA), is the most prominent of the two and spans the east of West 
Africa and parts of Central Africa, largely coinciding with Dryer’s (2009) core area 
of VO&VNeg languages reproduced in Figure 1. The second one, the Western Fo- 
cal Area (WFA), is less prominent and is restricted to West Africa. The two focal 
areas are separated by a major discontinuity around Ghana, Togo and Benin. In 
Section 5.2, I draw on other types of data to better calibrate the results of the
spatial analysis produced in Section 5.1 and to identify the historical core of the 
CFA. Finally, Section 5.3 addresses the distribution of optional and/or restricted 
CFNMs in Africa, with a particular focus on the spread of CFNMs among Bantu 
languages to the south of the CFA, primarily in the Congo River corridor and the 
north of the Democratic Republic of Congo. 
2 What kind of CFNMs are we looking at?


2.1 An inclusive definition: synchronic diversity as a window 
on language change 


Morphosyntactic properties of negation constructions differ across languages. 
Similarly, the marking of negation may vary within a given language from one 
predicative construction to another. There are many different parameters along 
which the variation occurs. Depending on our goals and means, we can cut up 
this variation space in different ways. The definition that I adopt here is rather 
inclusive since my goal is to capture as much of the diversity as possible. The rationale
behind this is that synchronic diversity directly reflects the gradual nature of language
change and thus offers us a window on the historical processes that
brought about the current situation. Thus, one of the most common mechanisms 
known to be involved in the evolution of negation constructions, the so-called 
Jespersen cycle (cf. van der Auwera 2009, 2010 for a general overview; Devos and 
van der Auwera 2013 on Bantu languages), proceeds through a number of stages 
with most intermediate stages characterized by variation in the marking of nega- 
tion within a given construction. Typically, related languages do not proceed on 
this path in exactly the same manner. Both the synchronic variation within one 
language and the synchronic diversity of negation patterns within a group of re- 
lated languages offer an invaluable source of information on the earlier stages of 
the respective languages and on the processes underlying the change in negation 
constructions. 

For the purposes of the present study, I consider as CFNMs the elements that 
may be used in the right periphery of negative verbal predications with clause 
scope negation but that do not appear in the corresponding positive predications 
and whose position is determined with respect to the clause as a whole. A CFNM 
may be the sole marker of negation in the clause or just one of the exponents of 
negation marking distributed within the clause. That is, my typology is not con- 
fined to double negation-marking, like the typology of Beyer (2009). A CFNM may 
be a dedicated negation marker or may also encode other meanings, such as 
tense, aspect, mood and emphasis (in this respect, see also Section 2.3). For the 
purposes of my typology, the degree of morphological boundness of CFNMs is not
relevant either. That is, they may be words (similarly to the negation markers in 
Dryer's 2009 typology), clitics, (supra)segmental affixes or non-linear morpho- 
logical operations. Similarly, my typology takes into consideration negation of all 
types of verbal clauses and is not restricted to the negation of declarative verbal 
main clauses, such as the “standard negation” typology of Miestamo (2008) or 
the “neutral clausal negatives” typology of Dryer (2009). Negation of nominal
predicates is beyond the scope of my typology for the reasons laid out in Section 
2.2. As discussed in Section 2.3, I consider both obligatory and optional CFNMs. 
The meaning of the description “clause-final” in CFNM is explained in more de- 
tail in Section 2.4. Section 2.5 further elaborates on the clause finality of negation 
markers from the perspective of different relative orders of object and verb. 
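
The inclusion criteria of this section can be read as a simple decision procedure. The following Python sketch is only an illustration under stated assumptions: the record type NegMarker, its field names and the function is_cfnm are hypothetical and do not come from the original study, where the coding was done by hand on the basis of grammatical descriptions. The feature values given for the three markers are simplified readings of the discussion in Sections 2.2, 2.3 and 2.5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NegMarker:
    """Hypothetical record for one negation marker in one construction type."""
    form: str
    verbal_predicate: bool         # construction has a verbal (not nominal) predicate
    right_periphery: bool          # marker may occur in the right periphery of the clause
    clause_scope: bool             # negation has clause scope
    in_positive_counterpart: bool  # marker also appears in the corresponding positive predication
    placed_wrt_clause: bool        # position is determined with respect to the clause as a whole
    obligatory: bool = True        # recorded, but not an inclusion criterion
    sole_exponent: bool = True     # recorded, but not an inclusion criterion


def is_cfnm(m: NegMarker) -> bool:
    """Apply the inclusive definition of Section 2.1 (illustrative sketch only)."""
    return (
        m.verbal_predicate
        and m.right_periphery
        and m.clause_scope
        and not m.in_positive_counterpart
        and m.placed_wrt_clause
    )


# Simplified, assumed codings of three markers discussed in the text.
dzuun_waa = NegMarker("waa", verbal_predicate=True, right_periphery=True,
                      clause_scope=True, in_positive_counterpart=False,
                      placed_wrt_clause=True, obligatory=False, sole_exponent=False)
ngangela_ko = NegMarker("ko", verbal_predicate=False, right_periphery=True,
                        clause_scope=True, in_positive_counterpart=False,
                        placed_wrt_clause=True, obligatory=False, sole_exponent=False)
jur_modo_de = NegMarker("dé", verbal_predicate=True, right_periphery=False,
                        clause_scope=True, in_positive_counterpart=False,
                        placed_wrt_clause=False)

for marker in (dzuun_waa, ngangela_ko, jur_modo_de):
    print(marker.form, is_cfnm(marker))
```

Note that obligatoriness and the number of negation exponents are stored but deliberately play no role in the test, which mirrors the inclusive character of the definition.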


2.2 Beyond the scope of the typology: negation of nominal 
predicates 


I am not concerned here with negation constructions with nominal predicates. 
The reason is not that I do not deem them relevant. Clearly, it is important to
also take into account negation strategies used with nominal predicates if one
wants to achieve a comprehensive diachronic account of clausal negation constructions
with verbal predicates. Thus, as pointed out by Croft (1991), negative
existential markers may come to be extended to negative verbal predications 
within the so-called “negative-existential cycle”. However, from the perspective 
of an areal typology of CFNMs, negation constructions with nominal predicates 
tend to present rather different types of analytic problems. For instance, in the 
case of negative existential constructions (as distinguished from locative-pre- 
sentative ones; cf. Veselinova 2013) that use a dedicated negation marker without 
any distinct existential marker, the question of whether the position of this 
marker is determined with respect to the clause as a whole or the nominal predi- 
cate may be simply irrelevant. For equational and identification constructions, it 
may not always be obvious which nominal should be considered the predicate 
(cf. Bisang and Sonaiya 2000 on Yoruba constructions of the structure X ‘BE’ Y, 
where X and Y are nominals). Besides these complications, another reason why
I decided not to include negation of nominal predicates in the areal typology of
CFNMs here is my strong impression that their inclusion would not significantly
affect the areal pattern established solely on the basis of constructions
with verbal predicates. Thus, I found only a few languages described with CFNMs
only in negation constructions with nominal predicates but not in the ones with 
verbal predicates, such as Ngangela [nba] (Bantu K12; Maniacky 2003), which has 
an optional CFNM ko only with nominal predicates as illustrated in (1) versus (2), 
and Beiya [kmy] (Adamawa, Samba Duru; Littig and Kleinewillinghöfer 2012),
which has a CFNM ?wé only with nominal predicates as illustrated in (3) versus
(4) (for both languages, the sources provide only examples of identificational
constructions). Although I do not consider these languages as having CFNMs for
the purposes of the typology presented here, their addition would not disrupt the
general areal pattern established on the basis of negation constructions with verbal
predicates only.


(1) Ngangela⁴


a. Kaci impweevó ko 
NEG.COP woman NEG 

b. Kaci impweevo 
NEG.COP woman 

c. kéci-ko impweevo 


NEG.COP-NEG woman 
‘It is not a woman.’ 
(Maniacky 2003: 192) 


(2) ko-tw-a-mween-e dingddmbe
NEG-1PL-PRF-see.PRF-NEG cows
‘We have not seen the cows.’ 
(Maniacky 2003: 140) 


(3) Beiya 
yó yen küsén ‘wa 
COP thing bush NEG 
‘It is not a wild animal.’ 
(Littig and Kleinewillinghöfer 2012: 6)




(4) Min túúrá Falé 
1SG come\NEG PROP
‘I do not come to Poli.’
(Littig and Kleinewillinghöfer 2012: 6)


4 The following abbreviations will be used here: 1,2,3 first, second and third person; COP copula; 
DAT dative; DEF definite; DEM demonstrative; EMPH emphatic; FUT future; INDF indefinite; IPFV imperfective;
LOG logophoric; NEG negation; NONHUM non-human; OBJ object; PFV perfective; PL plural;
POSS possessive; PQ polar question; PRF perfect; PROG progressive; PROP proper name; PST past;
QUO quotative; REFL reflexive; REL relative; SBJ subject; SBJV subjunctive; SG singular; STAT stative. 
2.3 The issue of optionality


I consider both obligatory and optional CFNMs. Obligatory CFNMs may be oblig- 
atory throughout negation constructions or be confined to a subset of those. Op- 
tional elements of negation constructions are taken into consideration in so far 
as their addition does not change the propositional meaning of the negative pred- 
ication or the constraints on their use are conditioned primarily by structural 
properties of their environment rather than their meaning (cf. Idiatov 2015 on the 
CFNM waa in the Mande language Dzuun [dnn]). Admittedly, it is not always pos- 
sible to make a clear-cut distinction along these lines precisely because language 
change is gradual. A frequent case of this kind involves elements
that are said to be optionally added to “emphasize” negation and are sometimes 
provided with translations such as ‘at all’. As a rule of thumb, I presume that if 
the author of a grammatical description deems it necessary to state that a nega- 
tion construction may contain a given optional element, this element is frequent 
enough in this construction for its original referential meaning to be sufficiently 
backgrounded. 

A different type of situation that is often conceived as involving optionality 
is when a default negation marker can be replaced by a negation marker that does 
change the propositional meaning of the negative predication and, for that rea- 
son, neither marker can be said to be obligatory as such, yet the presence of at 
least some such marker in the construction is required for the construction to be 
negative. In other words, it is the particular way of expressing negation within a 
negation construction that is obligatory but not the specific negation markers. 
French provides a good example of such a situation as a consequence of an on- 
going Jespersen cycle type evolution and loss of negative concord (cf. van der 
Auwera and Van Alsenoy 2016). In colloquial French, the older preverbal nega- 
tion marker ne is usually omitted and only the newer negation marker pas is used 
immediately following the verb, as in (5). The default negation marker pas can be
replaced by a number of more specific markers, such as jamais ‘(n)ever’, as in (6), 
or nulle part ‘nowhere’, as in (7), and, although the latter elements do change the 
propositional meaning of the predication, at least some such element must be 
used in this constructional slot for the predication to remain negative. The alter- 
native English translations of (6) and (7) using never and nowhere respectively are 
closer to the French original and have similar origins as well (cf. Ingham 2013). 


(5) French
Elle (ne) va pas.
She NEG goes NEG
‘She doesn't go.’


(6) Elle (ne) va jamais.
She NEG goes never
‘She doesn't ever go.’ or ‘She never goes.’


(7) Elle (ne) va nulle part.
She NEG goes nowhere
‘She doesn't go anywhere.’ or ‘She goes nowhere.’


A somewhat more complicated example is provided by the Mande language 
Dzuun [dnn], as discussed by Idiatov (2015). Thus, Dzuun has a default CFNM 
waa, as in (8), which may be omitted under certain conditions. In addition, 
Dzuun has a number of CFNMs that are semantically narrower than the default 
CFNM waa, such as dé ‘anymore, no more’ and küráá ‘(n)ever; (not) at all’. These
specific CFNMs usually stand alone, as in (9), replacing waa just like jamais or 
nulle part replace pas in the French examples (6) and (7). However, occasionally, 
they can also be followed by waa, as in (10), or they can co-occur with each other 
when the negative meaning needs to be further specified, as in (11). Finally, some 
of the forms that function as specific CFNMs can also occur in positive
constructions, as illustrated with dé in (12), where it functions as an emphatic
marker. In this respect, consider French jamais, which can also be used in positive
constructions, such as si jamais ‘if ever’ and pour jamais ‘forever’.


(8) Dzuun 
À naa wi © tsi waa 
3SG NEG.PST good 3SG.SBJV save NEG
‘It was not good that he be saved.’ 
(Solomiac 2007: 270) 


(9) Wó dn nda, wo na bóma jada dé 
2SG enter come.IPFV 2SG NEG exit see.IPFV anymore
‘You enter, but you do not find the exit anymore.’ 
(Solomiac 2007: 254) 


(10) Ta bwéy, bój ree naa nd re ye é 
DEM moment Elder PL  NEG.PST COP-3SG at  3PL.SBJV REFL 
sćrć küráá waa 
pray at.all NEG 
‘At that time, the elders did not want to pray at all.’ 

(Solomiac 2007: 256, 578) 
(11) A naa fyà fvé dé küraà
3SG NEG.PST fabric white anymore at.all
‘(When the chicken wanted to come with the white fabric,) it was not a white
fabric anymore.’
(Solomiac 2007: 539) 


(12) A cd, á! ci min dzūnwēīnsíá mún san firiu dé 
3SG QUO ah! QUO 1SG friend.DEF 1SG foot cheat.PFV EMPH
‘He said: “Ah! My friend has really cheated me.”’

(Solomiac 2007: 483) 


As the example of Dzuun illustrates, a marker need not be a dedicated negation
marker (i.e., intrinsically negative in its meaning) to be considered a CFNM.


2.4 The meaning of being clause-final 


The description “clause-final” in CFNM refers to the canonical position of the negation
marker on the extreme right periphery of a clause. A given negation
marker need not be in the absolute clause-final position in every possible con- 
struction to count as a CFNM. What is relevant is that, in the clause where the 
verbal predicate is accompanied by two or more simple nominal arguments and 
one simple adjunct modifying the predicate, such as a simple place or time ad- 
verbial, the position of the negation marker is determined with respect to the 
clause as a whole and not with respect to the verbal predicate, its nominal argu- 
ments or its modifier. In a given language, the position of the CFNM with respect 
to other right periphery markers and verbal predicate modifiers may be fixed or 
depend on a range of factors, such as their scope, meaning, morphosyntactic 
structure and length. Again, as in the discussion of optionality of CFNMs, a clear-cut
distinction along these lines may not always be possible because change is
gradual (although, more often, the difficulty is caused by the lack of relevant ex- 
amples in the sources). 

A good example of possible complexities involved in the syntax of CFNMs is 
provided by three Eastern Mande languages of the Boko-Busa cluster, Boko [bqc], 
Busa [bqp] and Bokobaru [bus], whose CFNMs have the form =o (Boko) and =ro
(Busa and Bokobaru). Like all Mande languages, the languages of the Boko-Busa
cluster have a strict SOVX constituent order in transitive constructions⁵ and SVX
in intransitive constructions, where X stands for “oblique”, which is any constituent
(an argument or an adjunct) other than S and O (cf. Creissels 2005). The canonical
position of the negation marker =(r)o is clause-final, as illustrated in (13).
However, other right periphery elements with clausal scope, such as the polar
question marker =å, follow the CFNM =(r)o, as illustrated in (14). Furthermore,
“sentence level adverbial phrases and clauses may follow the negative marker”
(Jones 1998: 299), as illustrated in (15), which can be compared to (14). The tendency
for adverbials to follow the negation marker =(r)o is more general in Busa
and Bokobaru while, in Boko, it is especially longer adverbials that are affected.
Finally, the negation marker =(r)o can be followed by the second coordinate in
the alternative coordination construction, as in (16). This may be analyzed as a
result of ellipsis, as is done by Jones (1998: 298). Alternatively, it may be seen as
extraposition of a constituent to the right periphery because heavy constituents,
such as the ones involving coordination, are dispreferred in argument positions
(subject, object, postpositional phrase). This would not actually be uncommon in
Mande and it would also parallel the tendency to place longer adverbials after the
CFNM =(r)o.


(13) Boko 
1 i gbé pi? kã lá álé 
fluid NEG.PFV person  that-PL intoxicate as 2PL.PROG 
e wàá-o 


see  like-NEG 
‘Drink has not intoxicated those people as you are thinking.’ 
(Jones 1998: 301) 


(14) 'ásí álé ma nódi ke e tia=o=a? 
SO  2PL.PROG 1SG.POSS trust make until  now=NEG=PQ 
‘So you are still not trusting me?’ (lit. ‘So you are not making my trust until
now.’)
(Jones 1998: 299) 


5 Unlike most other Mande languages, the languages of the Boko-Busa cluster also allow null 
objects with anaphoric reading but only when the referent is non-human and only in non-per- 
fective constructions, as well as perfective constructions with nominal subjects or a third person 
plural pronominal subject (Jones 1998: 212-213). 


An areal typology of clause-final negation in Africa —— 129 


(15) aa mé wa 4 mi=o e go pó 
3PL.PFV  Say.PFV 3PL.LOG.FUT water drink=NEG until time REL 
wa aa dé 


3.INDF.PFV 3SG.OBJ kill.PFV 
‘They said they would not drink until the time when he was killed.’ 
(Jones 1998: 299) 


(16) má 'ésé vi-o ge mdsć 
1SG.STAT sorghum  have-NEG or maize 
‘I don't have any sorghum or maize.’ 
(Jones 1998: 299) 


The canonical position of the negation marker =(r)o in Boko-Busa is clause-final 
and this is how it is classified within this typology. At the same time, the observed 
synchronic variation in its placement is indicative of an ongoing diachronic process
of the negation marker being attracted to the immediately post-verbal slot.⁶


2.5 CFNMs and the relative order of object and verb 


Unlike in the typology of post-verbal negation markers by Dryer (2009), which is 
confined to languages with VO order, the relative order of object and verb is not 
relevant in the typology of CFNMs presented here. The object can either precede 
the verb as in the Dzuun and Boko-Busa examples above or follow it, as in the 
Gbaya Kara [gya] (Gbaya-Manza-Ngbaka) example in (17). 


6 The attraction of the negation marker in Boko-Busa from its original clause-final slot toward 
the immediately post-verbal one is likely to have been triggered by substrate influence of 
Baatonum, a Gur language spoken immediately to the southwest of Boko-Busa. In Baatonum, 
negation markers are mostly preverbal, except in the negative perfective construction, where the 
preverbal negation marker is complemented by a verbal suffix (Winkelmann and Miehe 2009: 
181-182). In Mande, the placement of negation markers varies but CFNMs are never attracted to 
the immediately post-verbal position, as it is not the position associated with polarity marking 
in Mande languages. Furthermore, there are sufficient reasons to assume that a substantial part 
of the current Boko-Busa populations shifted to Boko-Busa from Baatonum at some point in the 
past. For instance, Jones (1998: 5) points out the clear relation between the Boko-Busa terms for 
the non-royal Boko-Busa people (‘peasants’, ‘vassals’, ‘slaves’) and the Boko-Busa designations 
of the Baatonum. 
(17) Gbaya Kara
2am gbé sadi ha k66  kóm non na
1SG kill\PFV animal so.that wife POSS.1SG eat\IPFV NEG
‘I did not kill game to feed my wife.’ (lit. ‘so that my wife eats’)
(Roulon-Doko 2012: 5)


What is relevant for my typology is that the position of the negation marker on 
the right periphery is determined with respect to the clause as a whole. Construc- 
tions with VO order and constructions with OV order may present different types 
of analytic problems for determining whether the negation marker is clause-final 
in this sense or not. 

As pointed out by Dryer (2009: 319), in those (Sub-Saharan) African lan- 
guages with VO order where the negation marker follows the object it “predomi- 
nantly” also follows “any adverbs or adjunct phrases”. In other words, it is typi- 
cally a CFNM. In this respect, Sub-Saharan African languages differ from 
languages with VO order and the negation marker following the object elsewhere 
in the world, such as German, the language cited by Dryer (2009) as an example. 
A rare example of a language from Sub-Saharan Africa similar to German is Jur 
Módó [bxe] (Bongo-Bagirmi; Andersen 1981; Persson and Persson 1991), spoken 
in South Sudan on the periphery of the core CFNM area (cf. Figure 8, 10 or 11). Jur 
Módó uses SVX order in intransitive constructions and SVOX order in transitive 
constructions. The slot immediately at the end of the verb phrase, viz., after V in
intransitive constructions and after O in transitive constructions or, framed differently,
immediately before the X slot, appears to be reserved in Jur Módó for at
least two grammatical markers, one of which is the negation marker dé, as illustrated
in (18) and (19), and the other one is the resultative or perfect marker déni
(called “perfective” by Andersen 1981 or “completive” by Persson and Persson
1991), as illustrated in (20).


7 Contrary to Dryer's (2009: 320) statement that the negation marker “can be freely positioned
among adverbial or adjunct elements”, Andersen (1981) and Persson and Persson (1991) only
specify that the negation marker is a type of “adverb” which should “occur in the adjunct [position],
separated from the verb by the object” (Persson and Persson 1991: 15). Yet, in all the examples
found in these sources, the negation marker is always the first of the “adverbs” or “adjuncts”,
immediately following the verb or, if present, the object.
(18) Jur Módó
m-ód5 nd5b5 dé kpé tí-i
1SG-do work NEG again  with-2SG
‘I won't work with you again.’ 
(Andersen 1981: 59) 


(19) móró  ílábá dé r» kòbì 
spear fall NEG at buffalo 
‘The spear did not hit the buffalo.’ 
(Andersen 1981: 80) 


(20) kirábà  Ope kómó  dénņí di mi málibiwü 
jackal release hare PRF from in snare 
‘Jackal released Hare from the snare.’ 

(Persson and Persson 1991: 15) 


Besides the typical situation in African VO languages, where the clause-final
status of a negation marker is relatively straightforward, we also find a number 
of VO languages on the periphery of the core CFNM area (cf. Figure 10 or 11), 
where a negation marker gravitates towards the end of the clause but it is not 
obvious whether its canonical position should be characterized as clause-final or 
not. One of the clearest examples of such a language is Nzadi [no code] (Bantu 
B865; Crane, Hyman and Tukumu 2011), whose description provides a detailed 
overview of the syntax of the post-verbal negation marker. In Nzadi, negation
is marked in two positions in the clause with the first marker occurring before the 
verb “in the auxiliary” (the form of this negation marker depends on the TAM 
values) and the second marker, bo, occurring after the verb “towards the end of 
the clause” and taking scope “over any of the elements” of the clause (Crane, Hy- 
man and Tukumu 2011: 169, 173). (21) schematizes the possible positions of bo in 
various clause structures. 
(21) Nzadi: the possible positions of the post-verbal negation marker bo (Crane,
Hyman and Tukumu 2011: 171)⁸

S-V-bo              *S-bo-V             *bo-S-V
S-V-O-bo            ?S-V-bo-O
S-V-IO-DO-bo        S-V-IO-bo-DO        *?S-V-bo-IO-DO
S-V-DO-Obl-bo       ?S-V-DO-bo-Obl      *?S-V-bo-DO-Obl
S-V-DO-Oblben-bo    S-V-DO-bo-Oblben    *S-V-bo-DO-Oblben
S-V-X-bo            S-V-bo-X


For the purposes of my typology, negation markers similar to Nzadi bo are classi- 
fied as optionally clause-final. From a diachronic perspective, such indetermi- 
nacy suggests an ongoing syntactic change whereby a negation marker that, by 
virtue of its etymology, has originally evolved in a certain slot in the clause structure
is being attracted to a different slot in the clause structure, presumably because
this slot is associated with the expression of certain types of meanings.⁹
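
One possible way of making the label “optionally clause-final” explicit is sketched below in Python. This is an illustration under assumptions, not the procedure used for the typology: the dictionary PLACEMENTS transcribes the acceptability judgments for Nzadi bo from (21), while the function names, the frame labels and the decision rule itself (a fully acceptable clause-final option in every frame plus at least one fully acceptable non-final option) are hypothetical.

```python
# Placement options for Nzadi "bo", transcribed from (21).
# Judgments: "" = acceptable, "?" = dispreferred or marginal,
# "*?" and "*" = ungrammatical.
PLACEMENTS = {
    "S-V": [("S-V-bo", ""), ("S-bo-V", "*"), ("bo-S-V", "*")],
    "S-V-O": [("S-V-O-bo", ""), ("S-V-bo-O", "?")],
    "S-V-IO-DO": [("S-V-IO-DO-bo", ""), ("S-V-IO-bo-DO", ""),
                  ("S-V-bo-IO-DO", "*?")],
    "S-V-DO-Obl": [("S-V-DO-Obl-bo", ""), ("S-V-DO-bo-Obl", "?"),
                   ("S-V-bo-DO-Obl", "*?")],
    "S-V-DO-Oblben": [("S-V-DO-Oblben-bo", ""), ("S-V-DO-bo-Oblben", ""),
                      ("S-V-bo-DO-Oblben", "*")],
    "S-V-X": [("S-V-X-bo", ""), ("S-V-bo-X", "")],
}


def is_final(order: str, marker: str = "bo") -> bool:
    """True if the marker occupies the last slot of the schematic order."""
    return order.split("-")[-1] == marker


def classify(placements: dict, marker: str = "bo") -> str:
    """Hypothetical decision rule for the clause-final status of a marker."""
    # Every frame must allow a fully acceptable clause-final placement.
    final_ok = all(
        any(is_final(order, marker) and judgment == "" for order, judgment in options)
        for options in placements.values()
    )
    # Some frame allows a fully acceptable placement that is not clause-final.
    nonfinal_ok = any(
        not is_final(order, marker) and judgment == ""
        for options in placements.values()
        for order, judgment in options
    )
    if final_ok and nonfinal_ok:
        return "optionally clause-final"
    if final_ok:
        return "clause-final"
    return "not clause-final"


print(classify(PLACEMENTS))
```

Under this rule, marginal options marked with a question mark do not count as evidence for non-final placement; a more lenient rule that counted them would give the same result for Nzadi.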


8 The asterisk <*> marks ungrammatical options. Elsewhere in the source, the examples of bo
placement options marked with the combination <*?> are also characterized as “ungrammatical”,
so it remains unclear what difference between <*> and <*?> was intended by the authors in
this table. The question mark <?> marks options that are characterized as “strongly dispreferred”
or “at least marginally acceptable”. S-V-IO-DO stands for ditransitive “double object constructions”,
where the indirect object is unmarked. S-V-DO-Obl stands for ditransitive “indirect object
constructions”, where the indirect object, referred to as oblique, is introduced by the locative
preposition kó. I added to the original table the row with the benefactive oblique (Oblben) marked
by sám +é N (lit. ‘reason of N’), because it differs from the obliques introduced by the preposition
kó in that “the preferred ordering may place bo before the benefactive”, although without any
“strong preference either way” (Crane, Hyman and Tukumu 2011: 170). Finally, “X can be a non-object
complement, or any adjunct, and may co-occur with direct and indirect objects [including
obliques], with bo placement restricted with regard to objects as in other cases” (Crane, Hyman
and Tukumu 2011: 171).

9 The original position of the Nzadi marker bo is probably after the indirect object, either the 
unmarked one or the one introduced by the preposition kó or, in the absence of such an indirect 
object, after the direct object or, when no object is present, after the verb. That is, it is now being 
attracted to the clause-final position, arguably because of its default clausal scope and its inter- 
mediary function as attenuator of the assertive strength of the negative predication as a whole. 
Its original placement can be explained by its likely etymology as a possessive pronoun, which 
used to be coreferential with the subject and functioned as a kind of attenuator, something like 
‘as for Xi [Si does not P]’, which can be roughly compared to some uses of emphatic pronouns in 
French, as in Pierre ne sait pas, lui ‘Pierre does not know (while others might know)’ (lit. ‘Peter
does not know, him’). Given the shape of bo, it is most likely the third person plural form that
has become generalized. As described by Devos and van der Auwera (2013), possessive pronouns 
are not uncommon as a source of secondary negation markers in the Bantu languages of the 
area. 
African languages with OV order and a post-verbal negation marker can be
subdivided into two groups for the purposes of my typology. In the first group, 
the verb is normally followed by some constituents (arguments or adjuncts) other 
than the object. Most such languages seem to behave like the Mande languages 
Dzuun and Boko presented in Sections 2.3 and 2.4 in that the post-verbal negation 
marker also follows other post-verbal constituents and thus can be characterized 
as clause-final. Most such languages are in fact Mande. In the second group, the 
clause is basically verb-final so that, in principle, the question of whether the 
post-verbal negation marker is oriented toward the clause as a whole or just the 
verb is not particularly meaningful. However, where some diachronic evidence is 
available, it is usually clear that the post-verbal negation marker is oriented to- 
ward the verb and not the clause as it often originates in a main verb reanalyzed 
as an auxiliary (cf. van Gelderen 2008: 232-233; Lucas 2009 on Afro-Asiatic lan- 
guages). Therefore, by default, the post-verbal negation markers in such lan- 
guages are not characterized as clause-final for the purposes of my typology. This 
situation is common among the Afro-Asiatic languages of northern and eastern 
Africa (Cushitic, Omotic, Semitic), as illustrated in (22) from Dhaasanac [dsh]
(Cushitic; Tosco 2001), in some Nilo-Saharan groups in Chad and Sudan (such as 
Saharan, Fur and Nubian) and in Dogon and Ijoid languages in western Africa, 
as illustrated in (23) from Jamsay [djm] (Dogon; Heath 2008). 


(22) Dhaasanac
yáa rum ma  ká Suggun-in
1SG.SBJ children NEG here bring.IPFV-NEG
‘I am not going to bring the children here.’
(Tosco 2001: 299) 


(23) Jamsay 
Óyóró kó-rü y2w3-l-á 
quickly  NONHUM-DAT  accept-PFV.NEG-3PL.SBJ
‘They did not readily accept it [= plow].’ 
(Heath 2008: 368) 


3 The data 


The data for this study come from individual grammatical descriptions complemented
by a number of existing typological surveys of negation patterns in Africa, such as
Dryer's (2009) survey of post-verbal negation markers in the VO languages of Central
Africa, Devos and van der Auwera's (2013) study of multiple negation marking in Bantu
languages¹⁰ and Beyer's (2009) study of double negation marking in the languages of the
area centered around the Volta River basin in western Africa. Whenever possible, I
cross-checked the information from typological surveys against grammatical descriptions.

My sample consists of 618 languages, of which 256 languages appear to use 
some kind of CFNM while 328 languages clearly lack a CFNM and, for 34 lan- 
guages, the information available was not sufficient for an informed decision. For 
most purposes, I combined the latter two groups as languages without CFNMs 
(362 languages). The geographic distribution of the 618 languages of my sample 
is presented in Figure 5 in Section 1. Figure 5 also represents this distribution as 
spatial intensity, that is, the degree of concentration of languages taken as points 
in space. The most important concentration of languages is found in the area 
around the border between Cameroon and Nigeria. Another area of high concen- 
tration of languages stretches from Togo into the southwest of Burkina Faso. Fig- 
ure 6 shows the geographic distribution and the spatial intensity of the 256 lan- 
guages that have CFNMs and Figure 7 of the 362 languages that do not have 
CFNMs. The overall pattern of distribution of languages with CFNMs in Figure 6 
resembles the pattern of distribution in the sample as a whole in Figure 5. The 
pattern in Figure 6 is, however, more spatially circumscribed in almost all direc- 
tions. It is basically restricted to northern Sub-Saharan Africa. Its focal area, alt- 
hough equally situated in the area around the border between Cameroon and Ni- 
geria, has a relatively northern position and its westward extension towards 
southwestern Burkina Faso is somewhat less pronounced and has a more clearly 
latitudinal east-west orientation (as opposed to a more southeast-northwest ori- 
entation in Figure 5). 
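
To make the notion of spatial intensity more concrete, the following minimal sketch estimates the intensity of a point pattern with a Gaussian kernel. It is only an illustration under stated assumptions: the coordinates and the bandwidth are invented, longitude and latitude are treated as planar coordinates, and the function name `kernel_intensity` is hypothetical rather than taken from the software used for Figures 5 to 7.

```python
import math

def kernel_intensity(points, x, y, bandwidth):
    """Gaussian-kernel estimate of point intensity (points per unit area) at (x, y)."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
    return total

# Hypothetical (longitude, latitude) pairs: a tight cluster plus one isolated point.
toy_points = [(9.0, 6.0), (9.5, 6.2), (10.0, 5.8), (9.2, 6.5), (30.0, -5.0)]

# Intensity inside the cluster versus at the isolated point.
dense = kernel_intensity(toy_points, 9.4, 6.1, bandwidth=1.0)
sparse = kernel_intensity(toy_points, 30.0, -5.0, bandwidth=1.0)
print(dense > sparse)
```

In such a sketch, a focal area is simply a region where many points contribute to the kernel sum at once, which is why the estimate inside the cluster exceeds the estimate at the isolated point.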


10 I am grateful to Maud Devos and Johan van der Auwera for providing me with the source 
database they created for that survey. 
Fig. 7: Geographic distribution of the 362 languages without CFNMs and their spatial intensity


The pattern of distribution of languages without CFNMs in Figure 7 is quite different,
especially in its eastern part. To begin with, the distribution in Figure 7 is much more
spread out. Importantly, it is characterized by a clear depression in Central Africa where
the pattern has a U-shaped curve. This depression in the pattern in Figure 7 is created by
the presence of a relatively homogeneous cluster of languages with CFNMs in that area, whose
importance may not be obvious from Figure 6. The depression (and the cluster of languages
with CFNMs that creates it) is northeast-southwest oriented and extends from the Central
African Republic along the Congo River, corresponding on the map to the border between
Congo and the Democratic Republic of Congo. This depression makes the pattern go southward
in Cameroon, parallel to the coast towards the lower reaches of the Congo River, where it
then turns eastward and finally turns back around central Democratic Republic of Congo in
a northeast direction toward Ethiopia. It is interesting to compare the West African focal
area of Figure 7 with that of Figure 6. The focal area in Figure 7 is both wider and more
pronounced, stretching in a southeast-northwest orientation similar to what we find in the
sample as a whole in Figure 5. Like the focal area in Figure 6, the eastern end of the
focal area in Figure 7 is situated around the border between Cameroon and Nigeria but it
has a clearly more southern position, spanning southeastern Nigeria and southern Cameroon.


4 Increasing the granularity in the data: 
obligatoriness and constructional freedom 


The patterns of geographic distribution of the languages of the sample presented 
in Figure 5 (Section 1) and Figures 6 and 7 in Section 3 are basically point patterns. 
As such, they show us the overall extent of the languages with and without 
CFNMs and highlight the regions of high and low concentration of the two types 
of languages. This is valuable information for a first approach to the phenome- 
non. However, this binary representation hides important diversity among the 
languages with CFNMs. In order to better capture this diversity and thus get a 
better idea of the possible historical and spatial dynamics that brought about the 
current situation, we need to increase the degree of granularity of our data. Following
the discussion of the different issues involved in delimiting CFNMs in Section 2, I will
use two parameters. The first is the obligatoriness of CFNMs, that is, whether CFNMs are
obligatory or optional, and the second is the presence of restrictions on the freedom to
use CFNMs in different constructions. The two ranking options of these parameters and the
pseudo-numerical values assigned to them are summarized in Table 1.¹¹ The last column
provides the number of languages of each type. CFNMs that are obligatory and free from
constructional restrictions are ranked highest, as 4, in both ranking options, while CFNMs
that are both constructionally restricted and optional are ranked lowest, as 1.


Tab. 1: Constructional restrictions and optionality: two ranking options


Constructional    Constructional   Obligatoriness   Obligatoriness   Number of
freedom highest   freedom                           highest          languages

0                 no CFNMs                          0                328
0.5               unclear                           0.5              34
1                 restricted       optional         1                7
2                 unrestricted     optional         2                22
3                 restricted       obligatory       3                31
4                 unrestricted     obligatory       4                196


In principle, either of the two parameters could be ranked first. It so happens that, 
for this particular distribution of languages with CFNMs, which is highly skewed 
to one side, both options produce very similar results. Nevertheless, I have a principled
preference for ranking obligatoriness highest because I conceive of obligatoriness as the
defining property of grammatical meanings (see Idiatov 2008 for a detailed discussion).
Lack of constructional restrictions on the use of a grammatical marker is a property of
canonical grammatical markers (in the sense of canonical typology; cf. Brown, Chumakina
and Corbett 2013). Therefore, CFNMs that are both obligatory and free of constructional
restrictions are canonical grammatical markers whereas other types of CFNMs fall short of
such status.

An important point to be mentioned with respect to the classification in Table
1 is that it classifies languages, not CFNMs. If a language has several CFNMs that


11 The values are pseudo-numerical in the sense that their numeric values are basically
arbitrary and are just intended to reflect the relative order of the different combinations
of the parameters. In fact, whether we use these numeric values or just an ordered list of
factors does not matter that much. The two methods give very similar results in terms of
spatial analysis, for instance, when we visualize them using spatial interpolation (see
Section 5.1). However, I prefer to use the pseudo-numeric values because they make it
possible to better capture the relative status of languages that do not have CFNMs as
opposed to the different types of languages that have or may have CFNMs. See also Section
5.1 for an alternative coding scheme for pseudo-numeric values applied in generalized
additive modeling.
differ with respect to the two parameters, I choose the CFNM that ranks highest, as the
closest to being a canonical grammatical marker, to represent the language as a whole.
Admittedly, this way, some of the diversity fails to be properly reflected in the typology,
but it is not clear to me how this information could be incorporated. Furthermore, my
impression is that adding it would not have significant effects on the overall results.
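
The classification procedure described in this section can be sketched in a few lines of code. The sketch below is only an illustration: the class `CFNM`, the functions `rank` and `language_value` and the example language are hypothetical, and the formula for the rank simply reproduces the "obligatoriness highest" values of Table 1.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFNM:
    obligatory: bool
    unrestricted: bool  # True if free of constructional restrictions

def rank(marker):
    """Rank 1-4 with obligatoriness ranked highest, as in Table 1."""
    return 1 + 2 * int(marker.obligatory) + int(marker.unrestricted)

def language_value(markers, unclear=False):
    """Value for a whole language: 0 = no CFNMs, 0.5 = unclear,
    otherwise the rank of its highest-ranking CFNM."""
    if unclear:
        return 0.5
    if not markers:
        return 0
    return max(rank(m) for m in markers)

# Hypothetical language with two CFNMs of different status.
example = [
    CFNM(obligatory=False, unrestricted=True),   # rank 2
    CFNM(obligatory=True, unrestricted=False),   # rank 3
]
print(language_value(example))  # 3
```

Taking the maximum is what makes the classification language-based rather than marker-based: the lower-ranking marker of the example language leaves no trace in the coded value.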


5 Areal typology of CFNM in Sub-Saharan Africa 


In this section, I first discuss the results and potential pitfalls of two methods of 
spatial analysis and visualization of the distribution of the values of the feature 
CFNM in Sub-Saharan Africa, as well as some geographic correlations that 
emerge from this analysis (Section 5.1). In particular, I apply spatial interpolation 
(using two different types of smoothing, kernel smoothing and inverse-distance 
weighted smoothing) and generalized additive modeling (GAM). The different 
methods used converge on the same spatial pattern of the feature CFNM. They 
confirm the existence, the position and the overall shape of two focal areas, the 
Central Focal Area (CFA) spanning the east of West Africa and parts of Central 
Africa and the Western Focal Area (WFA) restricted to West Africa. The two areas 
are separated by a major discontinuity around Ghana, Togo and Benin. Of the two 
focal areas, the CFA can be called the primary focal area, given its prominence, 
and the WFA a secondary focal area. In Section 5.2, I address the issue of the his- 
torical core of the CFA. In particular, I argue that, despite the apparent promi- 
nence of an area in southern Chad and the Central African Republic within the 
CFA, it cannot represent its historical core and that it is much more likely that the 
primary historical core of the CFA is situated immediately to the northwest of the 
Central African Republic along the Benue River corridor going from southern 
Chad through northern Cameroon into central Nigeria. At the same time, as dis- 
cussed in Section 5.3, this area in southern Chad and the Central African Republic 
prominent within the CFA must have served as the source for the spread of the 
feature CFNM among Bantu languages further south in the Congo River corridor 
and the north of the Democratic Republic of Congo. Section 5.3 further offers a 
discussion of the broader issue of the distribution of optional and/or restricted 
CFNMs in Africa and argues that, as expected, such grammatically non-canonical 
CFNMs tend to be peripheral areally as well. 
5.1 Spatial analysis: spatial interpolation, generalized
additive modeling and correlations with geography


The spatial distribution of languages with different kinds of CFNMs (as distin- 
guished in Section 4) and languages without CFNMs can be inspected in a num- 
ber of ways. The most straightforward option is to visualize the data by means of 
spatial interpolation, which I perform here using the pseudo-numeric values de- 
scribed in Section 4. We can also use an alternative coding scheme for pseudo- 
numeric values, as we will do for generalized additive modeling further in this 
section, or use an ordered list of factors without any noticeable impact on the 
results. Thus, Figure 8 shows the result of spatial interpolation using kernel 
smoothing and Figure 9 shows the result of spatial interpolation using inverse- 
distance weighted smoothing. 


Fig. 8: The spatial interpolation graphic of the different values of the feature CFNM (as de- 
scribed in Section 4) using Gaussian kernel smoothing (the default bandwidth value adjusted 
by 1.3) 
Fig. 9: The spatial interpolation graphic of the different values of the feature CFNM (as
described in Section 4) using inverse-distance weighted smoothing (power = 6)


Different spatial interpolation methods produce slightly different visualizations,
which can bring out different aspects of the spatial distribution.
Thus, the interpolation using kernel smoothing in Figure 8 is somewhat better in 
visualizing the overall structure of the spatial distribution of the feature CFNM. It 
clearly shows a major discontinuity in the distribution of languages with CFNMs 
in Northern Sub-Saharan Africa (NSSA) around Ghana, Togo and Benin. This dis- 
continuity cuts off a secondary focal CFNM area that is centered on the region 
where the borders of Burkina Faso, Mali and Ivory Coast come together. For ease 
of reference, I refer to this secondary CFNM focal area in NSSA as the Western 
Focal Area (WFA) and to the main CFNM focal area to the east of it as the Central 
Focal Area (CFA). Both the interpolation using kernel smoothing in Figure 8 and 
the interpolation using inverse-distance weighted smoothing in Figure 9 show 
that the CFNM areas in NSSA are largely confined to the hinterland. The location 
of the three clearest extensions of the CFNM area toward the Gulf of Guinea coast 
is somewhat more clearly visible in Figure 9. Thus, the first such extension is found
around southern Togo and Benin, largely bridging the gap between the CFA and 
the WFA. The second coastal extension is located in the south of central Nigeria 
and is formed by the southward spread of the Edoid languages. While the first 
two coastal extensions are themselves likely to result from relatively recent lan- 
guage spread and/or contact events, the gap between them may be just as well 
accidental. Thus, the gap area is occupied by one big language (or a cluster of
closely related lects), viz., Yoruba, that must have expanded into the gap area 
from a more hinterland location in central Nigeria relatively recently and the re- 
spective proto-language may have simply happened to lack the CFNM feature by 
chance or lost it when moving into the area. In this respect, note that CFNMs are 
found in various related languages spoken just outside of the gap area, such as 
many Edoid languages or Igala, which belongs to the same lower-level Yoruboid 
linguistic grouping as Yoruba. The third coastal extension along the Congo River 
corridor is due to relatively recent language and/or population movements out of 
Central Africa affecting the Bantu languages in that area (see Section 5.3). 

Depending on how the underlying data is distributed in space exactly, spatial 
interpolation may produce certain visualization artefacts that one should be 
aware of when analyzing the results. Thus, both methods exaggerate to different 
extents the prominence of a number of regions, such as the region where the bor- 
ders of the Democratic Republic of Congo, Angola and Zambia come together, the 
region in central Mozambique and the region toward the northeast of South Su- 
dan. These exaggerated prominence regions are due to the fact that the sample 
data points are sparsely distributed in these regions (either because there are 
simply fewer languages or because the languages were not sampled). To see why 
this may affect visualization, we can represent data points by peaks whose height 
corresponds to the numeric value of the types in Table 1. The peaks representing 
a few isolated examples of CFNM languages in such a region would get very wide 
slopes when other data points are far. Several low prominence regions in the Sa- 
hara in Figure 9 are basically due to the same reason and do not correspond to 
any real languages, as becomes clear when we compare Figure 9 to Figure 6. That 
these spurious prominence regions in the Sahara are absent in Figure 8 is just due 
to the way the sample data points happen to be distributed on the southern 
fringes of the Sahara and the bandwidth value chosen for kernel smoothing.¹² Finally,
care should be exercised when interpreting the region of high prominence
(as reflected by its darker shading) within the CFA in southern Chad and the Cen- 
tral African Republic in Figure 8. Although it is tempting to interpret it as the core 
or hotbed of the CFA, as discussed in Section 5.2, a different interpretation may 
be more appropriate. In this respect, note that this region of high prominence is 
absent in Figure 9, which uses a different spatial interpolation method. 
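
The artefact caused by sparsely distributed data points can be reproduced on toy data. The sketch below assumes that kernel smoothing amounts to a Gaussian-weighted (Nadaraya-Watson) average and implements inverse-distance weighting directly. The sample points, the bandwidth of 2 and the function names are illustrative assumptions; only the power of 6 is taken from the caption of Figure 9.

```python
import math

def idw(samples, x, y, power=6):
    """Inverse-distance weighted average of the sample values at (x, y)."""
    num = den = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # exactly on a data point
        w = d ** -power
        num += w * value
        den += w
    return num / den

def kernel_smooth(samples, x, y, bandwidth=2.0):
    """Gaussian-weighted (Nadaraya-Watson) average of the sample values at (x, y)."""
    num = den = 0.0
    for sx, sy, value in samples:
        d2 = (x - sx) ** 2 + (y - sy) ** 2
        w = math.exp(-d2 / (2.0 * bandwidth ** 2))
        num += w * value
        den += w
    return num / den if den > 0.0 else float("nan")

# Hypothetical data: five languages without CFNMs (value 0) clustered near the
# origin and one isolated language with a canonical CFNM (value 4).
cluster = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (2, 1, 0)]
sparse = cluster + [(20, 0, 4)]

# The same data plus three languages without CFNMs close to the query location.
filled = sparse + [(14, 1, 0), (16, -1, 0), (15, 2, 0)]

query = (15, 0)  # five units away from the isolated language
idw_sparse, idw_filled = idw(sparse, *query), idw(filled, *query)
ks_sparse, ks_filled = kernel_smooth(sparse, *query), kernel_smooth(filled, *query)
print(idw_sparse, idw_filled, ks_sparse, ks_filled)
```

With the sparse data, both estimates at the query location stay close to 4 even though the nearest data point is five units away; once nearby languages without CFNMs are added, both estimates drop toward 0. This is the "wide slope" effect that exaggerates the prominence of isolated data points.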


12 In this respect, note a number of slight northward spikes in Figure 8, such as the spikes in 
central Chad and southeastern Niger, which correspond to much clearer spikes in the same 
places in Figure 9. 
Although spatial interpolation is a valuable visualization tool, it does not
produce quantifiable results. A statistical tool well-adapted for spatial analysis
that can do that is provided by generalized additive models (GAM),¹³ for instance,
as implemented in the mgcv package for R (Wood 2015). There are different ways
to code our response variable, viz., the type of the feature CFNM. One option
would be to code it as an ordered categorical variable. The big disadvantage of
this option is that it does not reflect the importance of the divide between the
absence and presence of CFNMs in the language or the particular hierarchy among the
six values of the feature CFNM in Table 1. For this reason, I will not use this coding.
The results it produces are actually largely comparable to those of the other two
options that we are going to consider, although less clear-cut. The latter two options
both involve pseudo-numeric values assigned to the six values of the feature CFNM. The
first coding scheme simply reuses the numbers from Table 1 (with obligatoriness ranked
highest). The other coding scheme uses a scale from 0 to 1, with 0 corresponding to the
absence of CFNMs and 1 corresponding to the presence of canonical CFNMs (type 4 in
Table 1), which may better capture the hierarchy between the six values of the feature
CFNM. The two coding schemes are compared in Table 2.


Tab. 2: Two coding schemes for pseudo-numeric values of the feature CFNM


Constructional freedom   Obligatoriness      Scheme 1   Scheme 2
                         (ranked highest)

no CFNMs                                     0          0
unclear                                      0.5        0.5
restricted               optional            1          0.625
unrestricted             optional            2          0.75
restricted               obligatory          3          0.875
unrestricted             obligatory          4          1
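
The relation between the two coding schemes can be written down explicitly. The values of Scheme 2 happen to follow a simple pattern: the values for "no CFNMs" and "unclear" are unchanged, and types 1 to 4 are spread evenly over the upper half of the 0 to 1 scale. This regularity is read off Table 2 and is not a rule stated in the text; the names `SCHEME_1` and `to_scheme_2` are hypothetical.

```python
# Scheme 1 values from Table 2 (obligatoriness ranked highest).
SCHEME_1 = {
    "no CFNMs": 0.0,
    "unclear": 0.5,
    "restricted optional": 1.0,
    "unrestricted optional": 2.0,
    "restricted obligatory": 3.0,
    "unrestricted obligatory": 4.0,
}

def to_scheme_2(value):
    """Map a Scheme 1 value onto the 0-1 scale of Scheme 2."""
    if value <= 0.5:          # absence of CFNMs or unclear: unchanged
        return value
    return 0.5 + value / 8    # types 1-4 in steps of 0.125 up to 1

SCHEME_2 = {label: to_scheme_2(value) for label, value in SCHEME_1.items()}
print(SCHEME_2)
```

Under this reading, Scheme 2 compresses the differences among the four types of CFNMs (steps of 0.125) relative to the divide between the absence and the presence of CFNMs.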


13 GAM is an extension of multiple regression that provides flexible tools for modeling complex 
interactions describing wiggly surfaces. A practical introduction to GAMs for linguists can be 
found in Baayen (2013), Tamminga, Ahern and Ecay (2016) and Winter and Wieling (2016). Some 
good examples of the use of GAMs in linguistics in relation to spatial analysis are provided by
Wieling, Nerbonne and Baayen (2011) and Wieling et al. (2014). Idiatov and Van de Velde (2016, 
2018) apply GAM for spatial analysis of lexical frequencies of labial-velar stops in NSSA. 
We will consider two GAMs, one using Scheme 1 and one using Scheme 2. Both
GAMs estimate the values of the feature CFNM as a function of the combination 
of longitude and latitude using thin-plate regression splines. The plots in Figure 
10 and Figure 11 represent the regression surface of the two GAMs produced using 
Gaussian distribution as a contour plot with the heat map color scheme. In the 
heat map color scheme, the lighter the color, the higher the temperature, which, 
in our case, corresponds to a higher pseudo-numeric value of the feature CFNM. 
The contour lines are isopleths that mark deviations from the mean in terms of 
standard deviation. 


Fig. 10: A contour plot with the heat map color scheme visualizing a GAM produced using
Scheme 1 for coding of the feature CFNM (k=13, family = Gaussian, edf = 41.73, p < 2e-16,
deviance explained = 43.7%, AIC = 2234)
Fig. 11: A contour plot with the heat map color scheme visualizing a GAM produced using
Scheme 2 for coding of the feature CFNM (k=13, family = Gaussian, edf = 42.72, p < 2e-16,
deviance explained = 42.5%, AIC = 552)


The two GAMs produce very similar results. Basically, they only differ in their 
Akaike information criterion values (AIC), with the GAM based on Scheme 2 hav- 
ing a much better AIC. However, this does not so much reflect the quality of the 
respective model as the difference in the coding scheme used, arguably favoring 
Scheme 2. 
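
That AIC values depend on the coding of the response can be checked without fitting a GAM. As a rough sketch, the code below computes the Gaussian AIC of an intercept-only model for the 618 languages under each scheme, using the counts from Table 1. The intercept-only model is a deliberately trivial stand-in and not one of the models in Figures 10 and 11; the function name `intercept_only_aic` is hypothetical.

```python
import math

COUNTS = [328, 34, 7, 22, 31, 196]           # languages per type (Table 1)
SCHEME_1 = [0, 0.5, 1, 2, 3, 4]              # Table 2, Scheme 1
SCHEME_2 = [0, 0.5, 0.625, 0.75, 0.875, 1]   # Table 2, Scheme 2

def intercept_only_aic(values, counts):
    """AIC of a Gaussian model with a constant mean, fitted by maximum likelihood."""
    n = sum(counts)
    mean = sum(v * c for v, c in zip(values, counts)) / n
    rss = sum(c * (v - mean) ** 2 for v, c in zip(values, counts))
    sigma2 = rss / n                          # ML estimate of the variance
    log_lik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    k = 2                                     # estimated parameters: mean and variance
    return 2 * k - 2 * log_lik

aic_1 = intercept_only_aic(SCHEME_1, COUNTS)
aic_2 = intercept_only_aic(SCHEME_2, COUNTS)
print(aic_1 > aic_2)
```

Even for this trivial model, the coding with the smaller numeric range yields the lower AIC. For a Gaussian model, multiplying the response by a constant c shifts AIC by 2n log c regardless of how well the model fits, so AIC values are only comparable between models fitted to the same response.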

The GAM plots in Figure 10 and Figure 11 are very similar to the spatial inter- 
polation plots in Figure 8 and Figure 9, minus the visualization artefacts pro- 
duced by spatial interpolation and discussed above. The different methods used 
converge on the same spatial pattern of the feature CFNM. They confirm the ex- 
istence, the position and the overall shape of two focal areas, the CFA and the 
WFA, separated by a major discontinuity around Ghana, Togo and Benin. 

Of the two focal areas, the CFA can be called the primary focal area, given its 
prominence, and the WFA a secondary focal area. The overall shape of the two 
focal areas can be broadly described as the hinterland of the Gulf of Guinea. Start- 
ing with the WFA in southern Mali and northern Ivory Coast, the region where 
the feature CFNM is prominently present extends eastward, primarily following 
grasslands and woodland savannahs north of the forest zone, for the most part
staying outside of the coastal regions. It is interrupted by only one major discon- 
tinuity separating the WFA from the CFA. Geographically, this major discontinu- 
ity corresponds rather well to the so-called Dahomey Gap, a southward savannah 
corridor interrupting the zonal West African rain forest. This may look strange at 
first, as the WFA and the CFA themselves are in the savannah zone. However, the 
north-south orientation of this savannah corridor can also be seen as conducive 
to interrupting the general east-west dynamics of the population and language 
movements in the western part of NSSA. Thus, I believe that this discontinuity is 
primarily due to the combined effect of the southward spread of Songhay into the 
current gap area from the north and the northward spread of the Tano subgroup 
of Kwa from the coastal regions in the south. 

The slight southeastward bend of the CFA in southern Chad toward the Central
African Republic closely follows the orientation of the relevant ecological zones,
topography and hydrography of this part of NSSA, as illustrated in Figure 12 on a
relief map of Africa.¹⁴ The CFA further marginally spills over into South Sudan in
the east and, much more significantly, into equatorial Central Africa in the south- 
west along the Congo River corridor (see Section 5.3). The two plausible zones 
through which the interaction between the CFA and these neighboring areas is 
likely to have primarily occurred are marked in Figure 12 as A and B respectively, 
with the difference in the degree of interaction graphically represented by the dif- 
ference in font style of the two symbols. 


14 The eastern borders of Chad and the Central African Republic largely correspond to the di- 
vide separating the Lake Chad and Congo River drainage areas to the west from the Nile River 
drainage area further east. The southwestern border of Chad and the western border of the Cen- 
tral African Republic roughly reflect the divide separating the same Lake Chad and Congo River 
drainage areas from the drainage areas of the Niger River and some smaller rivers flowing into 
the Gulf of Guinea. The shape of the divides themselves naturally results from the relief of the 
area. 
Fig. 12: A relief map of Africa: the arrow highlights the correlation between the topography and
the southeastward bend in the CFA around southern Chad; A and B mark plausible primary
interaction zones between the CFA and the neighboring regions in South Sudan and equatorial
Central Africa


5.2 The Central Focal Area: the core issue 


In Figure 8, which is a spatial interpolation plot with kernel smoothing, and Fig- 
ures 10 and 11, which are visualizations of two different GAMs, the CFA is charac- 
terized by a region of high prominence in southern Chad and the Central African 
Republic (as reflected by its darker shading in Figure 8 and lighter colour in Fig- 
ures 10 and 11). Therefore, it may be tempting to interpret the region in southern 
Chad and the Central African Republic as the core or hotbed of the CFA, with all 
the obvious historical implications that such an interpretation would entail. At 
the same time, this region of high prominence is absent in Figure 9, which uses a 
different spatial interpolation method, viz., inverse-distance weighted smooth- 
ing. Similarly, the same region looks anything but prominent in Figure 6, which 
shows the distribution of the languages with CFNMs and their spatial intensity. 
Clearly, care should be exercised when interpreting the relevance of this region 
of high prominence within the CFA. In fact, I believe that the prominence of the
region in southern Chad and the Central African Republic is epiphenomenal and
that this region is not the core of the CFA. A much better candidate for this role is
the Benue River corridor going from central Nigeria through northern Cameroon
into southern Chad. The apparent prominence of the region in southern Chad and the
Central African Republic within the CFA stems from a combined effect of a number of
factors, primarily relevant to its southern part in the Central African Republic. On the
one hand, we have the geography of the region, which makes it a kind of cul-de-sac with
lots of marshy and seasonally flooded areas. On the other hand, we have the linguistic and
population history of the region, which appears to be characterized by strong founder effects.

When looking closer into this part of the CFA, it is first important to note that
this region of Central Africa is very homogeneous with respect to the feature CFNM,
with most (if not all) languages having canonical CFNMs (type 4). At the same
time, this region of Central Africa is both rather homogeneous linguistically and
rather sparsely populated. It is sparsely populated both in absolute terms, as becomes
obvious from the low density of populated places in this region in Figure 13, and in
terms of the languages spoken there. The latter fact is apparent in Figure 5, which
shows the distribution of the languages of the sample.


Fig. 13: Populated places in NSSA (based on the data from GeoNames.org) - the oval centered 
on the Central African Republic roughly indicates the relevant southern part of the CFA 
To see how the sparse language density can be relevant for the apparent prominence
of the region in question, recall the discussion of some of the possible artefacts
of the visualization by means of spatial interpolation in Section 5.1. As for
the linguistic homogeneity, this region is occupied by a small number of linguis- 
tic groups, all of which are rather shallow, viz., Gbaya-Manza-Ngbaka, Sere- 
Ngbaka-Mba, Banda, Ngbandi-Mongoba-Kazibati, Zande and Western Sara- 
Bongo-Bagirmi. All but the last group have traditionally been classified together 
under the label Ubangian, although more recently Gbaya-Manza-Ngbaka was ex- 
cluded from this group. Western Sara-Bongo-Bagirmi is a branch of the Sara- 
Bongo-Bagirmi family, which itself is a subgroup within Central Sudanic. The 
map in Figure 14 illustrates the location of the first five groups under their older 
grouping as Ubangian and the map in Figure 15 shows the distribution of the 
Sara-Bongo-Bagirmi languages as a whole.


Fig. 14: Ubangian groups (Sere-Ngbaka-Mba, Banda, Ngbandi-Mongoba-Kazibati, Zande) and
Gbaya-Manza-Ngbaka (formerly also classified as Ubangian) (Moñino 1988)
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Fig. 15: Sara-Bongo-Bagirmi languages (Boyeldieu 2006) — the Western Sara-Bongo-Bagirmi 
languages are in pink (or light grey) (e.g., Kaba), dark grey (e.g., Gula or Gele) and brown (or 
normal grey) (e.g., Yulu) 


The degree of internal diversity within all the six groups (Gbaya-Manza-Ngbaka, 
four Ubangian groups and Western Sara-Bongo-Bagirmi) is also rather low. Some 
of these groups could have just as well been referred to as languages without 
much exaggeration. Furthermore, within Ubangian, at least based on lexical similarity 
(cf. Boyeldieu and Cloarec-Heiss 1986; Moñino 1988: 19), Sere-Ngbaka-Mba 
and Banda can be said to be rather closely related and probably also form one 
group with Ngbandi-Mongoba-Kazibati, with only Zande left out as not being 
transparently related to the other three Ubangian groups. 

Another important point is that, besides being rather shallow and having a 
low level of internal diversity, most, if not all, of these groups are very likely to 
have moved into this region of Central Africa relatively recently and it is only 
upon their entering this region that most speciation events within these groups 
have occurred. For instance, Figure 16 shows the reconstructed migration routes 
of the Sara-Bongo-Bagirmi populations, indicating the Proto Sara-Bongo-Bagirmi 
homeland in what is now South Sudan just outside of our region in question, the 
Proto Western Sara-Bongo-Bagirmi diversification node in the northeast of the 
Central African Republic inside the region in question and an important node of 
further diversification within Western Sara-Bongo-Bagirmi around the border be- 
tween Chad and the Central African Republic, viz., the Proto Sara node. Similarly, 
as can be seen in Figure 14, both the Sere group of Sere-Ngbaka-Mba and a part 
of Banda are still spoken in South Sudan approximately in the same area as the 
Proto Sara-Bongo-Bagirmi homeland in Figure 16. Furthermore, the available ev- 
idence suggests that both the remaining part of Banda and the Ngbaka-Mba 
group of Sere-Ngbaka-Mba have also moved into the region in question from approximately 
the same area in South Sudan (e.g., Rombi and Thomas 2006: 22 for 
Ngbaka-Mba; Tisserant 1930: 8–10 for Banda). 


Fig. 16: The reconstructed migration routes of the Sara-Bongo-Bagirmi populations (Boyeldieu 
2006) (*SBB is the Proto Sara-Bongo-Bagirmi homeland; *OCC the Proto Western Sara-Bongo- 
Bagirmi diversification node; *SARA the Proto Sara diversification node, a major subgroup 
within Western Sara-Bongo-Bagirmi) 


Summing up, the region in question appears to have been occupied relatively re- 
cently by six shallow and rather homogenous linguistic groups and, of these six 
groups, at least three are transparently related to one another and may also be 
more distantly related to yet another group out of these six, which reduces the 
total number of distinct linguistic groups involved to three. These groups 
have moved into the region from outside. Moreover, all but one group have most 
likely migrated out of the same general area in what is now South Sudan, which 
makes it likely that they had been in close contact with each other even before 
entering the region in question. The only probable exception is Gbaya-Manza- 
Ngbaka, which is more likely to have entered the region from the north somewhere 
in southern Chad and closer to the majority of the remaining languages of 
the CFA. These groups have then undergone further diversification upon entering 
the region. As the proto-language of Gbaya-Manza-Ngbaka is likely to have already 
had CFNMs, we end up with a very high probability that the remaining two (or 
maximally three) proto-languages coming out of the same area simply happened 
to have CFNMs from the start or happened to acquire them upon entering the region. 
In such a situation, it is easy to imagine how the region may have 
become as homogenous as it is with respect to the relevant feature due to 
founder effects. In fact, the languages spoken in the region in question are rather 
homogenous with respect to a number of features that are otherwise highly unu- 
sual typologically, such as the high lexical frequency of labial-velar stops (Idia- 
tov and Van de Velde 2016, 2018), the prominent presence of labial flaps (Olson 
and Hajek 2003) and the use of possessee-like qualifier constructions (also 
known as dependency reversal) (Van de Velde 2012, 2013: 233-234). 

In view of the arguments presented above, it is extremely unlikely that the 
prominent presence of canonical CFNMs in this region of Central Africa attests in 
any way to the hypothesized historical role of this region as a would-be primary 
core of the CFA. At the same time, as will be discussed in Section 5.3, this region 
of Central Africa with its prominent presence of CFNMs must have served as the 
source for the spread of the feature CFNM among Bantu languages further south 
in the Congo River corridor and the north of the Democratic Republic of Congo. 
Given the overall orientation of the CFA and its population dynamics as driven by 
the ecology and geography of the area, it is most likely that the primary historical 
core of the CFA is situated immediately to the northwest of the Central African 
Republic along the Benue River corridor going from southern Chad through 
northern Cameroon into central Nigeria. This is basically the high prominence 
region on the spatial intensity plot of languages with CFNMs in Africa in Figure 
6. The region along the Benue River corridor is densely populated both in terms 
of people (cf. Figure 13) and languages (cf. Figures 5 and 6). Moreover, the lin- 
guistic landscape of this region is highly fragmented and characterized by a lot 
of deep linguistic diversity, which forms a stark contrast with the Central African 
Republic areas of the CFA further to the southeast that I have discussed first. 


15 This does not mean, of course, that these features need to be restricted to the languages of 
the region. 

5.3 Optional and/or restricted CFNMs: grammatically non-canonical 
and areally peripheral 


From the perspective of language change dynamics, optionality and construc- 
tional restrictions of CFNMs are likely to characterize either innovated or disap- 
pearing markers, depending on what the direction of change is. Detailed compar- 
ative studies would be required to determine the direction with certainty. Yet, 
given what I know about the languages in question, my strong impression is that 
such CFNMs are more often innovations than retentions from older stages on their 
way out. 

From an areal typological perspective, languages with optional and constructionally 
restricted CFNMs, viz., languages with scores 1, 2 and 3 in the column 
“obligatoriness highest” in Table 1 (which I will refer to as types 1, 2, 3), are 
expected to be located mostly toward the periphery of the area of languages with 
CFNMs. This expectation is indeed borne out, as can be seen by comparing the 
spatial distribution of such languages in Figure 17 with that of all the languages 
having CFNMs in the sample in Figure 6 (cf. Section 3). Remarkably, languages of 
types 2 and 3 happen to be in almost complementary spatial distribution, with 
noticeable overlaps only in the Congo River corridor (cf. Section 3) and in south- 
western Cameroon, which are also the same two regions where one finds lan- 
guages of type 1. Moreover, languages of type 2 are almost all Bantoid and, within 
that group, they are almost all Narrow Bantu languages whereas the languages 
of type 3 are much more genetically diverse, which suggests that there is some- 
thing different about Bantoid and especially Narrow Bantu languages. Typologi- 
cally, Narrow Bantu languages are indeed known to differ in many respects from 
the languages in more northern parts of Sub-Saharan Africa, to most of which 
they are actually related (cf. Clements and Rialland 2008; Güldemann 2008). 

Fig. 17: Languages with optional and constructionally restricted CFNMs (see Table 1 for the 
meanings of the values 1, 2, 3 in the column “obligatoriness highest”) 


A possible explanation for the observed distribution of type 2 is that CFNMs 
in Bantoid and especially Narrow Bantu languages tend to develop through 
somewhat different pathways than elsewhere, pathways that are less likely 
to lead to constructional restrictions but are more likely to result in optional 
CFNMs, either in the sense that they are clause-final but optional in that position 
or that they are optionally clause-final. As described by Devos and van der Au- 
wera (2013), “recurrent sources for post-verbal negative markers [including 
CFNMs] in Bantu languages are locative pronouns, possessive pronouns and neg- 
ative (answer) particles”, which indeed seem to be rarely attested as sources of 
CFNMs in more northern parts of Sub-Saharan Africa. As mentioned with respect 
to the negation marker bo in the Bantu language Nzadi in Section 2.5, which is 
optionally clause-final, possessive pronouns as a source of post-verbal negation 
markers are, for instance, unlikely to originate in the clause-final position. To 
some extent, the same is true for locative pronouns in Bantu. Negative answer 
particles as a source of post-verbal negation markers, although likely to originate 
in the clause-final position, are unlikely to be constructionally restricted. 

The frequent optionality of CFNMs in Bantu must have much to do with their 
relatively young age. The relatively recent innovative character of CFNMs in 
Bantu is confirmed by their restricted distribution within Bantu and important 
variation in their forms across Bantu, which starkly contrasts with the relative 
uniformity and almost universal obligatory presence of the older pre-verbal ne- 
gation markers (cf. Kamba Muzenga 1981; Güldemann 1999; Devos and van der 
Auwera 2013). Moreover, while the forms of the older pre-verbal negation markers 
can be reconstructed to Proto Bantu, they cannot be provided with an etymology 
other than a negation marker (Kamba Muzenga 1981). At the same time, CFNMs 
cannot be reconstructed to Proto Bantu and, when their etymology can be established, 
they often originate in elements that are not negation markers. The only 
notable exceptions are CFNMs originating in negative answer particles. 

Another important factor contributing to the frequent optionality of CFNMs 
in Bantu is that, typically, clause-final markers as such are not a prominent mor- 
phosyntactic feature of Bantu languages and, in this respect, they clearly differ 
from languages of northern Sub-Saharan Africa (cf. Idiatov 2012a). Whereas the 
prominent presence of clause-final markers in the morphosyntax of languages of 
northern Sub-Saharan Africa would be propitious for the upgrade of innovated 
CFNMs from optional to obligatory status, such a pull factor is generally lacking 
in Bantu languages. 

Within Bantu, the Congo River corridor is clearly the focal area of the inno- 
vation of CFNMs. This is suggested already by the observation in Figure 17 that 
Bantu languages of type 3 are basically confined to this region and that the con- 
centration of Bantu languages of type 2 is also highest in the same region. The 
importance of the Congo River corridor becomes even more obvious when we in- 
clude Bantu languages of type 4, i.e., Bantu languages with obligatory and con- 
structionally unrestricted CFNMs. Thus, as can be seen in Figure 18, Bantu lan- 
guages with more canonical CFNMs are equally concentrated in the Congo River 
corridor. 

Fig. 18: Bantu languages with CFNMs (see Table 1 for the meanings of the values 1, 2, 3, 4 in the 
column “obligatoriness highest”) 


Furthermore, we can observe in Figure 18 two weak stretches of Bantu languages 
with CFNMs that appear to be linked to the northern and southern ends of the 
Congo River corridor and that both run southeast, one to the north and the other to 
the south of the Congo River basin. Of these two secondary prominence zones, 
the southern one is clearly historically an offshoot of the Congo River corridor 
whereas the northern one must share its origin with the Congo River corridor in 
the CFA, situated further north in the Central African Republic (cf. Figures 8 and 
11 and Section 5.2), as schematically illustrated in Figure 19. Admittedly, we can- 
not completely rule out the possibility that optional CFNMs in Bantu languages 
in East Africa have evolved independently. 

Fig. 19: The suggested direction of spread of the use of CFNMs in Bantu from a major focal area 
of CFNM use in northern Central Africa into the Congo River corridor and the two secondary 
prominence zones 


The emergence of both the Congo River corridor and the two secondary promi- 
nence zones must result from relatively recent population and/or language move- 
ments out of Central Africa. They clearly occurred much later than the original 
Bantu expansion in and around the Congo River basin. In this respect, compare 
the areal pattern of the distribution of CFNMs in Bantu with the Bantu expansion 
route reconstructed by Grollemund et al. (2015) and reproduced in Figure 20. The 
comparison makes it clear that the southwest expansion of the use of CFNMs in 
the Congo River corridor went in the direction opposite to the original route of 
Bantu expansion in the northern half of the Congo River basin. The northern sec- 
ondary prominence zone does not correspond to any original route of Bantu ex- 
pansion in that area from the northern Democratic Republic of Congo. The south- 
ern secondary prominence zone partially corresponds to an original route of 
Bantu expansion in that area, yet it could not have formed before the emergence 
of the Congo River corridor zone and therefore could not have coincided in time 
with this part of the Bantu expansion route. 


Fig. 20: Bantu migration route reconstructed by Grollemund et al. (2015) on consensus tree by 
using geographical locations of contemporary languages and connecting ancestral locations 
by straight lines (the true route will differ) (numbered positions correspond to major diversifi- 
cation nodes on the consensus tree; the curved dashed line indicates the suggested migration 
route through savannah corridors; lighter (green) shading corresponds to the delimitation of 
the rainforest at 5,000 BP; the darker (green) shading corresponds to the delimitation of the 
rainforest at 2,500 BP) 

6 Concluding remarks 


The synchronic patterns that we discover are necessarily a product of language 
change as it evolves in time and space. This paper provides an analysis of spatio- 
temporal language dynamics in Sub-Saharan Africa with respect to the feature 
CFNM. It is my strong conviction that the most plausible account of synchronic 
patterns can only be gleaned by casting your net wide to catch more of the syn- 
chronic diversity, rather than by trying to reduce it. Furthermore, when analyzing 
areal patterns, it is also important to consider together the languages that have 
the feature under investigation and the languages that do not have it. In terms of 
spatial analysis, methods such as spatial interpolation and generalized additive 
modeling (including their mixed extensions) provide particularly valuable re- 
search tools. 

A question that one is often tempted to ask when doing areal typology is 
which linguistic group of the ones present in the area where the feature is prom- 
inent could have been the primary vector of the feature. Yet, this question can 
only have a fully meaningful answer if we know that no linguistic groups have 
disappeared from the scene without traces since the emergence of the feature in 
the area. Unfortunately, in a region such as Sub-Saharan Africa, we cannot be 
sure of that. In fact, we can be quite sure of the contrary. Furthermore, the fact 
that all members of a certain linguistic group carry the feature and are spoken 
inside the area where the feature is prominently present cannot prove that this 
linguistic group is the primary vector of the feature. 

For instance, Dryer (2009: 346) entertains two possible scenarios with respect 
to his core area of VO&VNeg languages in Central Africa, which, unsurprisingly, 
largely coincides with our CFA of the feature CFNM. The first scenario is that the 
feature VO&VNeg originates in Chadic languages, because the feature is perva- 
sive in Chadic languages and all Chadic languages are spoken inside the relevant 


16 For instance, see Kleinewillinghöfer (2001) on Jalaa, an apparent linguistic isolate in northeastern 
Nigeria (the area that is particularly relevant for the feature CFNM), which went extinct 
fairly recently and for which only some lexical data could be collected from rememberer speakers. 

17 It is not very surprising because Dryer (2009) restricts his study to negation markers that are 
words. Although not all post-verbal negation markers that are words are also clause-final, 
CFNMs are almost always analyzed as words precisely because of their clausal orientation and, 
therefore, at least canonical CFNMs would always be classified as post-verbal negation markers 
in Dryer's (2009) typology. 
area. Yet, Chadic languages are also typologically quite different from their distant 
Afro-Asiatic relatives, most of which, moreover, do not carry the relevant feature. 
In other words, we cannot know whether the feature can be reconstructed 
to Proto Chadic as independent of the current participation of Chadic languages 
in the area in question. The second scenario is that the presence of the feature in 
Chadic is due to a substrate influence from “Nilo-Saharan” and that the substrate 
influence could have such a pervasive effect on Chadic languages because of the 
relatively small size of the region populated by speakers of Chadic languages. The 
Nilo-Saharan group that is both spoken in the same area as Chadic languages and 
is classified positive for the relevant feature by Dryer (2009: 311) is Western Sara- 
Bongo-Bagirmi. However, as laid out in Section 5.2, Western Sara-Bongo-Bagirmi 
languages are one of the recent newcomers in the area. That is, even if there was 
a substrate influence on Chadic, which is actually quite plausible given the typo- 
logical differences between Chadic and its Afro-Asiatic relatives, we cannot know 
what that substrate was, nor can we know whether the pervasive presence of the 
feature in Chadic can be attributed to this substrate. 

The considerations above equally apply to the role of Chadic and other simi- 
larly homogenous groups (such as Gbaya-Manza-Ngbaka and Zande) within our 
CFA of the feature CFNM. While a homogenous distribution of the feature within 
a given linguistic group (all members of the group carry the feature and are spoken 
inside the area) does not tell us much about the spatio-temporal dynamics of the 
feature, a much more informative signal is usually provided by groups that are 
diverse with respect to the feature, with members both inside and outside the 
area, especially when it can be complemented by independent information on 
language and population movements. Thus, in the case of the feature CFNM, var- 
ious Niger-Congo groups are spoken around the Benue River corridor in the CFA, 
as well as further to the west in the WFA. At the same time, many Niger-Congo 
groups are also spoken outside of the CFA and the WFA. We can reasonably hy- 
pothesize that the latter Niger-Congo groups have lost the feature CFNM when 
they moved outside of the area, as when entering the forest zone along the coast 
of the Gulf of Guinea, for instance, due to some substrate influence (just as, for 
instance, they developed high lexical frequency of labial-velars in the same 
coastal regions; cf. Idiatov and Van de Velde 2016, 2018), or when some groups 
lacking the feature entered the area from outside. For instance, as discussed in 
Section 5.1, the former type of loss is likely to be the case for the Yoruboid lan- 
guages in the coastal gap in southwestern Nigeria between two coastal exten- 
sions of the CFA while the latter type of loss is likely to be the reason behind the 
emergence of the major discontinuity separating the WFA from the CFA. At the 
same time, it is rather unlikely that the feature CFNM should be reconstructed to 
higher nodes in the Niger-Congo tree, not even to the Proto Benue-Congo node. A 
major counterargument to such a deep reconstruction is presented by the general 
lack of CFNMs in Southern Bantoid languages (with the most noticeable excep- 
tion of the Bantu languages of the Congo River corridor but, as discussed in Sec- 
tion 5.3, this is clearly a recent development). Within Benue-Congo, their super- 
ordinate group, and within Niger-Congo in general, Southern Bantoid languages 
and especially Bantu languages are generally considered archaic in their typo- 
logical profile (e.g., Hyman 2011). 


Ekkehard König 
Definite articles and their uses 


Diversity and patterns of variation 


Abstract: The goal of this paper is to provide the basic outline of a typological 
study of definite articles, on the basis of both formal and notional criteria, with a 
focus on European languages. In contrast to earlier contributions to this topic and 
to recent, more comprehensive typological studies, more attention will be paid to 
(i) the problems of providing a clear semantic basis for the comparison and (ii) 
the reconstruction of plausible historical developments, following the leads of 
Greenberg (1990) and others. In addition to developing a more fine-grained ty- 
pology of definite articles, the paper will also show that, even in the restricted 
area of Europe, we find a remarkable diversity in the meaning and use of definite 
articles. 


Keywords: formal and notional criteria for comparison, types of definite articles, 
diversity of forms and use 


1 Introduction 


One of the basic assumptions of structuralism (cf. Lazard 2012), viz., the assumption 
of the sign as an inseparable union of acoustic image and concept (signifiant 
versus signifié), has often been abandoned in linguistics, especially in compara- 
tive and typological studies. Only through this change in the theoretical founda- 
tions of linguistics has it become possible to base the comparison of languages 
not only on formal but also on notional criteria and to compare the different ways 
in which specific meanings are encoded in languages. In the domain under dis- 
cussion, i.e., definite articles, we can base a comparison on suitable formal crite- 
ria and investigate different meanings and uses of comparable forms (e.g., con- 
stituents of noun phrases or nouns, preceding or following a noun) or on notional 
criteria like “definiteness” and investigate the different ways of encoding this notion. 
In further illustration of the second approach, let me briefly mention that 
the list of formal properties that have been enumerated as markers of definiteness 
includes the following: word order, sentential stress, adnominal pronouns 
(Louagie and Verstraete 2015), case, number marking, aspect and topic markers. 
Both of these approaches and their combination require, however, clear defini- 
tions and explications of their basic terms “definite article” and “definiteness”. 

The goal of this paper is to provide the basic outline of a typological study of 
definite articles, on the basis of both formal and notional criteria, with a focus on 
European languages. In contrast to earlier contributions to this topic (Krámský 
1972; Nocentini 1996) and to recent, more comprehensive typological studies 
(Dryer 2005, 2014), more attention will be paid to (i) the problems of providing a 
clear semantic basis for the comparison and (ii) the reconstruction of plausible 
historical developments, following the leads of Greenberg (1990), Hawkins 
(2004) and Heine and Kuteva (2006). 


2 Definition, identification, establishing 
comparability 


Definite articles have traditionally been identified and described for modern Eu- 
ropean languages (Germanic, Romance, Celtic, Basque, Hungarian, Bulgarian) 
and for Semitic languages. Moreover, emergent articles can be found in the pe- 
riphery of Europe, i.e., Finnish (Chesterman 1991), Sorbian and Polish (Heine and 
Kuteva 2006). In fact, definite articles and their contrasts to indefinite ones are 
often considered to be one of the most characteristic features of Europe as a lin- 
guistic area (cf. Haspelmath 2001). The relevant grammatical category was ab- 
sent, however, in earlier stages of Indo-European languages, with the exception 
of Classical, post-Homeric Greek. Typological studies have recently shown that 
something like definite articles is also found elsewhere (in Central Africa, Meso- 
America and the Pacific). 

On the basis of his rich collections of data, Dryer (2005; 2014) has provided a 
comprehensive description of the diversity found in the forms and uses of definite 
articles in the world. In one of his contributions to the World atlas of language 
structures, he identifies definite articles cross-linguistically on the basis of the 
following syntactic criteria: they are free or bound morphemes, constituents of 
noun/determiner phrases, derived but different from adnominal demonstratives, 
typically forming an opposition with indefinite articles, and they cannot occur on 
their own (i.e., they cannot be heads in the traditional sense of the term) (Dryer 
2005: 154). These formal criteria are clearly applicable to the invariant pre-nominal 
article in English (the), to the definite articles in French, which inflect for gender 
(le, la) and number (le, les) and to the definite articles in German, which inflect 
for gender (der, die, das), number (die) and case (der, des, dem, den and so 
on). They also apply to the post-posed articles of Scandinavian (-en), Bulgarian 
(-ta, -to, -te), Romanian (-ul, -a and so on) and Basque (-a, -ak). 

Dryer's semantic criteria, by contrast, are much more general and less restric- 
tive: definite articles encode “definiteness” and have at least an anaphoric use, 
i.e., they can have the same referent as an antecedent found in a preceding sen- 
tence or text. This definition and the typology it underlies have been criticized as 
being too broad and too vague and as being therefore applicable to languages 
which do not meet the criteria generally subsumed under the term “definiteness”, 
such as “uniqueness”, “familiarity” and “inclusiveness” (cf. Davis, Gillon and 
Matthewson 2014). In a more elaborate follow-up article to the brief general 
sketch required by the World atlas of language structures format, Dryer (2014) ex- 
plains that he wanted to uncover a wider diversity in the use of definite articles 
than is presented in earlier descriptions and to show that languages with a binary 
contrast between definite and indefinite articles of the sort found in English are 
uncommon outside of Europe and the Middle East. 

As already mentioned, the main focus of my paper is on European languages. 
Its goal is to establish more solid semantic foundations for a comparative study 
of definite articles and to reconstruct the development of these expressions on 
the basis of available data and plausible processes of semantic change and gram- 
maticalization. The implementation of these goals will be a first step toward a 
more fine-grained typology of definite articles and ultimately provide a better ba- 
sis for extending the scope of such a typology to the specific articles of Polynesian 
languages (cf. Mosel and Hovdhaugen 1992; Moyse-Faurie 1997) and other sys- 
tems discussed in Dryer (2014). Moreover, it will also be pointed out that, even in 
the restricted area of Europe, we find a remarkable diversity in the use of definite 
articles. 

The concept “definiteness” that is used in the label for the relevant class of 
functional expressions is by no means a basic or primitive concept and therefore 
in need of explication. Using this label in the analysis of articles does not say 
much more than that an expression of a specific language is translated by the 
definite article the in English. Various attempts to explicate this notion in terms 
of more elementary ones can be found in philosophical studies (Russell 1905; 
Frege 1984; Neale 1990), in linguistic studies such as Hawkins (1978) and Abbott 
(2004) and, more recently, in formal semantic studies such as Elbourne (2010; 
2012), Gisborne (2012) and Coppock and Beaver (2015). This is not the place for a 
detailed discussion of the relevant formalisms. So let me just point out that the 
more elementary notions used in the relevant explications are the following: 
“uniqueness”, “salience”, “existence”, “identifiability” and “inclusiveness”. Of 
these elementary notions, “uniqueness” is the most important one. Whenever we 
use a definite article, as in (1), we presuppose that reference is made to an object 
or entity that is unique and therefore clearly identifiable in a given context. 


(1) a. Could you pass me the salt? 
b. Let’s have a look at the church! 
c. The book I bought yesterday is on the short-list for a prize. 


An additional criterion of salience is important for those cases where several ob- 
jects meet the description ‘church’ in (1b) or ‘book I bought yesterday’ in (1c). In 
those cases, it has been shown that interlocutors, even at an early age, look for 
an additional property that distinguishes one entity from the others. Furthermore, 
in nearly all cases where a unique object is referred to, there is also a presupposition 
of existence. Nevertheless, it is possible to construct examples where 
this presupposition is not met, like in (2), where a book has been written by two 
authors so that there is no “single author” (Coppock and Beaver 2015). 


(2) Houellebecq is not the only author of La vie en rose. 


The criterion of inclusiveness or exhaustivity is relevant for plural contexts, 
where the definite article is quite similar to universal quantifiers like all. A re- 
quest like (3) would generally be meant to include all the cushions outside. 


(3) It is raining. Could you bring in the cushions! 


Since plural contexts pose additional problems, we will not consider them any 
further in what follows. Nor will we consider such quantificational uses as are 
exemplified by (4), where the definite article is in the scope of and bound by the 
quantifier each. 


1 The additional pragmatic criterion of salience does not provide the required solution for counterexamples 
to the uniqueness requirement like he is standing at the corner of the intersection 
(Coppock and Beaver 2015: 394). The relevant examples have been discussed in a wide variety of 
studies and, apparently, constitute a limited but systematic set of counterexamples to the 
uniqueness requirement. 
(4) The mother of each girl was there when their bus left.


For all of the concepts discussed above, there are precise formal explications in 
the relevant literature — in some cases, controversial in their details. In summary 
and without going into the details of a rich literature and complex discussion, we 
can say that it is the presupposition of uniqueness that is the most important in- 
gredient of the meaning of definite articles. This assumption of uniqueness guar- 
antees that the referent is identifiable for the interlocutor. In the terminology of 
pragmatics, more specifically in the view of Relevance Theory (Sperber and Wil- 
son 1996), definite articles “come with a guarantee of identifiability”. 

Given this requirement of uniqueness in a given context, let us now consider 
the various ways in which a context may identify a unique object. The most im- 
portant contextual types are described in the following list: 


(5) presupposition of uniqueness and identifiability in a certain context: 
a. identification through the situation of speech or universe of discourse 
(situational use, visibility or general background knowledge) 
b. identification through sufficient description (cataphoric use) 
c. identification by the preceding context (anaphoric use)
d. identification through appeal to personal memory, partial description 
(recognitional use, emploi mémoriel) 
e. identification by association with an identifiable entity (associative use) 




These different ways of contextually identifying the referent of a definite article 
can be illustrated by the following examples: 


(6) a. Pass me the salt. Today the sun is shining. The Pope will come to Paris. 
b. The book I bought yesterday is under discussion for the Nobel Prize. 
c. Somebody stole my bike yesterday but they have already found the thief. 
d. You remember the restaurant we went to recently. That is where I found a 
wallet. 
e. We laid out the picnic. The coffee was still warm. 


These are the five context types most frequently distinguished in the literature 
(cf. Hawkins 1978; Löbner 1985; Himmelmann 1998; De Mulder and Carlier 2011).
In (5) and (6), they are listed in the order of their historical development. The most 
basic way in which a referent might be unique and thus identifiable is its pres- 
ence in the context of speech, as in (1a) and (6a). A slight extension of this domain 
of identification then leads to referents that are unique in a universe of discourse: 
the Pope, the sun, the government, the weather and so on.² We know that there
are many suns in the universe but there is only one that is of interest in the con-
text of our weather. A dedicated militant of a political party will simply speak of
“the party” whenever she makes reference to her own group and can even give
that identification a high scalar value by stressing the definite article (THE [ði:]
party). In the anaphoric and cataphoric uses of the definite articles, the referents 
are given in the co-text, in the preceding co-text for anaphoric reference, as in 
(6c), and in the following co-text for cataphoric reference, as in (6b). Note that 
definite descriptions, i.e., the identification of a referent through a description of
its salient properties, are simply regarded here as instances of cataphora. The
recognitional use (emploi mémoriel) requires a search in the memory of interloc- 
utors rather than in the co-text or the non-verbal context. According to Himmel- 
mann (1997), this use of demonstratives has played the decisive role in the devel- 
opment of the definite article. A characteristic feature of this use is the explicit 
appeal to the hearer to search for the relevant context in his/her memory. Finally, 
the associative use requires that a referent is identifiable through its association 
with another one given in a context (cf. Clark and Marshall 1981). There are many 
relations between entities that provide such a bridge: part-whole, as in (6e), or 
action-instrument, as in examples like (7). 


(7) Our neighbor was killed. The weapon was found two days later. 


3 Origin and historical development 


Let us now consider to what extent the preceding ordering of contexts for the
use of definite articles is historically relevant and squares with the available
historical evidence. There is clear historical evidence and general
agreement that definite articles, at least in Europe, derive from adnominal
demonstratives. This development is relatively recent: among the ancient
languages of Europe, only (post-Homeric) Greek and Old Norse had articles.
Right from the start we must admit, though, that the available historical infor- 
mation is limited and does not enable us to clearly reconstruct and document the 


2 One of the reviewers pointed out that absolute unica or entities conceived as such (church, 
sun) could appear without articles in older stages of European languages. It is quite plausible to 
assume that, in those stages, the relevant nouns were used as proper names. 
relevant processes of grammaticalization so that we partly have to rely on syn-
chronic evidence.

In his frequently cited paper, Greenberg (1990) distinguished three stages in 
the development of definite articles from demonstratives, as in Table 1. 


Tab. 1: Greenberg's (1990) stages of the development of definite articles 


Stage 0                 Stage 1            Stage 2           Stage 3
distal demonstratives   definite article   general article   noun marker


In Hawkins (2004) and Heine and Kuteva (2006), this schema is further elabo- 
rated to include four stages. In our elaboration of these schemata, we will distin- 
guish five stages, including the use of demonstratives as a separate stage and the 
development of specific articles in Polynesian languages as a further develop- 
ment, whose details are not very clear, however. The two hierarchies in (8) and 
(9) roughly characterize the co-evolution of form and meaning in the historical 
development of definite articles, exclusive of Greenberg's final stage.


(8) demonstratives > strong article > weak article > generic article > specific ar- 
ticle 


The labels listed and ranked in (8) correspond to the uses in (9).³


(9) exophoric > endophoric (anaphoric/cataphoric) > associative > generic/ab- 
stract > specific 


Let us now take a detailed look at labels and uses, at the relevant historical de-
velopments and at individual expressions manifesting a particular use.⁴ It is an
essential property of demonstratives that they have an exophoric, pointing and 
contrastive use. They can be used with a pointing gesture and identify entities in 


3 What the chains described in (8) or (9) primarily indicate is the development of a new use. 
Languages differ, however, in whether the use to the left is kept or not in addition to acquiring 
the new one, i.e., the relevant articles differ in their degree of grammaticalization. In contrast to 
English, definite articles can still be used exophorically in German and the weak articles of Fri- 
sian can still be used for unique entities (cf. Ebert 1970). 

4 A more detailed discussion and documentation of the relevant semantic changes, of contro- 
versies and relevant data is presented in De Mulder and Carlier (2011). 
contrast to others (e.g., I want THIS book [gesture] and not the other). Demonstra-
tives of all syntactic and semantic types generally have anaphoric and cataphoric
uses in addition to the exophoric one but what is maintained in the two endo- 
phoric uses of demonstratives is the contrastive element (e.g., yesterday I bought 
a book and THIS book I will give to my mother), which is no longer expressed by 
definite articles like English the.⁵ So in the first stages of their development to
articles, demonstratives do not only lose their exophoric (gestural) use but also 
their contrastive meaning. This is exactly the reason why an anaphoric use of an 
adnominal demonstrative is not a sufficient condition for using the term “definite 
article” for the relevant expression. 

There is good evidence for Romance languages that it is typically the distal 
demonstrative (Latin ille) that gives rise to definite articles (French le/la/les). 
Given, however, that demonstratives invariably also have an anaphoric use, it 
should not come as a surprise that some articles seem to be based more clearly 
and exclusively on this anaphoric use. This is the case not only for those articles 
derived from Latin ipse (es, sa in Western Catalan and Sardinian) but also for 
combinations like ledit ‘said’ in Middle French and all combinations of verbs of 
saying with demonstrative elements (German der erwähnte/besagter ‘said’, Eng-
lish the aforementioned) or from verbs of saying alone (cf. De Mulder and Carlier
2011). It should also be mentioned in this connection that the deictically neutral
demonstrative ce in French manifests the properties of the first stages in the de- 
velopment of a demonstrative to a definite article, i.e., the anaphoric and the an- 
amnestic (recognitional) use (cf. De Mulder and Carlier 2006). 

German and Dutch manifest another interesting, intermediate stage in the 
development of demonstratives to definite articles: reduced forms of the demon- 
stratives, identical to the definite articles, have an anaphoric pronominal use, in 
which they contrast with personal pronouns in the choice of the antecedent they 
relate to. Consider the minimal pairs in German in (10) and (11). 


(10) German
a. Der Hausbesitzerᵢ informierte den Handwerker, bevor erᵢ nach Hause ging.
‘The owner of the house informed the workman, before going home.’
b. Der Hausbesitzer informierte den Handwerkerⱼ, bevor derⱼ nach Hause ging.
‘The owner of the house informed the workman, before he went home.’


5 In contrast to the invariant definite article the in English, its German counterparts still have 
an exophoric use: ich möchte DAS Buch ‘I would like to have that book’.
(11) a. Emmaᵢ liebt ihreᵢ Tochter und ihreᵢ Freunde.
‘Emma loves her daughter and her (own) friends.’
b. Emmaᵢ liebt [ihreᵢ Tochter]ⱼ und derenⱼ Freunde.
‘Emma loves her daughter and her (daughter's) friends.’


These examples show that the relevant reduced demonstratives are used as pro- 
nouns rather than adnominally. Moreover, they are used anaphorically like per- 
sonal pronouns, with which they contrast in (10) and (11). In contrast to the latter, 
however, they do not pick out the subject as their antecedent but another one, 
viz., one that the personal pronoun could not relate to without ambiguity, as in- 
dicated by the co-indexation and the English translation. What we find here is a 
change from the nominal exophoric use of demonstratives to an anaphoric one, 
with the same formal reductions also found in the definite article, accompanied 
by a loss of the exophoric use only, i.e., without a loss of the contrastive compo- 
nent. This component of contrast now manifests itself in a differential choice, i.e., 
in the choice of an antecedent different from that made by personal pronouns. 
The question of how this choice is best described (obviative, non-subject, non- 
topical antecedent) is a matter of some controversy and will not be pursued any 
further at this point (for a more detailed discussion, cf. Bosch, Rozario and Zhao 
2003; Bosch, Katz and Umbach 2007). 

The relevant step in the change from an exophoric to an anaphoric or cata- 
phoric use is the fact that the search for a unique referent is transferred from an 
external situation to a search in the co-text, either preceding or following. Only if 
this change is accompanied by a loss of the contrastive meaning can we speak of 
an emergent article, however. In contrast to the anaphoric use, the cataphoric use 
provides an identification via a description, i.e., by a relative clause or any other 
nominal adjunct. In German, there are combinations of articles and distal demon- 
stratives which, through their composite forms, clearly illustrate the transition 
from demonstrative to definite article in the cataphoric use (der-jenige ‘he who’, 
die-jenige ‘she who’). These forms are typically employed in cataphoric contexts, 
i.e., with a following relative clause (e.g., diejenigen Studenten, die noch nicht 
bezahlt haben, möchten dies bitte bald tun ‘the students who have not paid yet are
kindly asked to do so soon’), even though their anaphoric use is also mar-
ginally possible.

The next stage in the development of definite articles involves a major step 
in the availability of a context for identification, from an external, situational or 
textual context to a more abstract context of association, of memorizing or of gen- 
eral availability in a universe of discourse. It is here that we find the associative 
and recognitional use (emploi mémoriel) of articles, as well as those cases where 
the cultural or local context provides a unique referent. The associative use, as in
(6e), is often regarded as the crucial step in the development of a definite article, 
since this use is not available for demonstratives. 

In this domain beyond the anaphoric and cataphoric uses of demonstratives 
and articles, some languages (varieties of Low German, Frisian, Scandinavian 
and Standard German) draw a distinction between two types of definite articles: 
a strong one (pragmatic definiteness) and a weak one (semantic definiteness) (cf.
Heinrichs 1954; Ebert 1970; Löbner 1985, 2011; De Mulder and Carlier 2011;
Schwarz 2013, 2014). On the basis of the available literature, the distinction in the 
use of these two definite markers can roughly be described as follows: (i) the 
strong article manifests the situational use, the anaphoric one, including pseudo- 
anaphors (e.g., Bill left — the fool had forgotten his money) and typically also the 
cataphoric one; (ii) the weak article occurs in associative contexts, in reference to 
unique entities in the universe of discourse, as well as in generic contexts. In 
Standard German the regularities are somewhat more complex (cf. Bosch 2013; 
Schwarz 2013, 2014). A distinction of this kind only shows up in connection with
the fusion of definite articles and prepositions, subject to additional phonological
constraints (im, am, zum, vom, beim, zur, ins, ans); these fused forms contrast with the
strong, non-fused forms. The latter manifest the anaphoric and cataphoric uses
whereas the fused forms typically exhibit the associative use, as well as the use
related to unique referents given in the abstract universe of discourse, and also 
occur in generic sentences. In minimal pairs like (12), the weak article refers to an 
abstract institution whereas the strong article refers to a specific building, simi- 
larly to the use versus non-use of definite articles in English. 


(12) a. Karl geht noch zur Schule. 
‘Charles still goes to school.’ 
b. Karl ging zu der Schule hin. 
‘Charles went to the school building.’ 
c. Karl ist im Gefängnis.
‘Charles is (doing time) in prison.’
d. Karl ist jetzt in dem Gefängnis.
‘Charles is now inside the prison.’
e. Ich möchte zur Kirche gehen.
‘I would like to go to church.’
Examples like (12) also show that weak definites in their fused forms have special
semantic properties in German, in addition to their contextual restrictions. As 
pointed out by Bosch (2013), they often involve some semantic enrichment over 
and above their local information, they lack the existential presupposition typi- 
cally found in connection with strong definites and the identification of a referent 
is typically not required or possible. Sentences like (12a) and (12c) identify a loca- 
tion but also express the activity associated with that location. And an utterance 
like (12e) could meet with a response pointing out that there is no church in the 
relevant area. 

The next step articles typically take in extending their use is the domain of 
abstract terms and generic sentences. Note that, in contrast to (13) and (14), all 
preceding examples were episodic sentences. French and Italian are clear exam- 
ples of languages where generic sentences and expressions denoting abstract 
terms require the definite article whereas this is only optional in German and un-
usual in English.⁶ In the abstract and generic use, reference is made to kinds and
to abstract entities (cf. Behrens 2005). 


(13) a. La solitude est difficile à supporter. (French) 
b. (Die) Einsamkeit ist schwer zu ertragen. (German) 


c. Loneliness is difficult to live with. 


(14) a. Les faucons sont des oiseaux de proie. (French) 
b. (Die) Falken sind Raubvögel. (German)
c. Falcons are birds of prey.


The final stage, i.e., the one that leads from definite articles as found in Europe 
to specific articles, is based on highly controversial assumptions and no convinc- 
ing semantic reconstruction has been proposed so far. An extension in the use of 
definite articles to merely expressing specificity is assumed in Hawkins (2004: 
85) and subsequently adopted in Heine and Kuteva (2006: 103). This assumption 
is, however, rejected by Himmelmann (1997: 107), who assumes that specific ar- 
ticles evolve directly from demonstratives. 


6 One of the reviewers pointed out that, in Spanish and French, generic uses of definite articles
were already observable in the very first texts, although they were less frequent. This greater 
time depth of the generic use offers an explanation for their obligatory nature in current use but 
throws some doubt on the historical scenario assumed above. 
Let us now look at some of the relevant languages and data. Specific articles
are found in Polynesian languages and also in a few Melanesian ones (cf. Mosel
and Hovdhaugen 1992; Moyse-Faurie 1997). The relevant articles in Polynesian
languages (le in Samoan and East Futunan, te in Maori and East Uvean), for ex- 
ample, are not only used in contexts which require definite articles in European 
languages but also for the introduction of a discourse referent and for contexts 
where an indefinite article would be used in most European languages. The op- 
position between a specific and a non-specific use of the indefinite article in Eng- 
lish, for example, is expressed by the contrast in (15) between specific le and non- 
specific se in East Futunan. 


(15) East Futunan 
a. E iai le Pilitania e fia 'avaga a Malia mo ia.
‘There is an Englishman Malia wants to marry.’ 
b. E faka ‘amu a Malia ke ‘avaga mo se Pilitania. 
‘Mary would like to marry an Englishman.’ 


(Moyse-Faurie, p.c.) 


For such contrasts, the terms “specific” versus “non-specific” articles are indeed
appropriate: the article le has lost the uniqueness presupposition but has re- 
tained the existential implication typically associated with the definite article. So 
far, we can still speak of semantic loss or erosion. On the other hand, specific 
articles are also used for emphatic (contrastive) assertion of membership in a 
class in contrast to another, as in (16) from East Uvean. 


(16) East Uvean 
Ko te fafine ia, mole ko te tagata. 
‘It is a woman, not a man.’ 
(Moyse-Faurie 2016: 77) 


An analysis which sees specific articles as further developments of definite arti- 
cles as they are found in Europe has to assume that these articles have not only 
lost their essential property, i.e., the presupposition of uniqueness, but have also 
re-acquired the contrastive use of demonstratives. Another explanation for the 
various uses of specific articles could be sought in the conceptualizations they 
express. We could assume, for example, that classes are conceptualized as indi- 
viduals in Polynesian languages but further evidence would have to be provided 
for that assumption. On the basis of such an assumption, the use illustrated by
(16) would express the contrast between two individuals rather than between two 
members of two classes. It is therefore not clear whether specific articles are re- 
ally a further development of definite articles. Given that our perspective is pri- 
marily a typological one, we will not pursue this question any further. 

Analogously to the labels and functions ordered in the two hierarchies in (8) 
and (9), we can also now rank the relevant semantic changes (extension of con- 
texts) as in Table 2. 


Tab. 2: Semantic changes in the development of definite articles 


Stage 1   exophoric; contrastive
Stage 2   anaphoric; cataphoric (sufficient description); emploi mémoriel; (loss of contrastive and exophoric use)
Stage 3   abstract context; universe of discourse; extension of context from co-text to abstract universe of discourse
Stage 4   generic, abstract; non-referential; non-episodic contexts
Stage 5   specific; contrastive; loss of uniqueness


So far, nothing has been said about a use found in some languages (Greek, Cata- 
lan, Romanian and Albanian; optionally in German and Italian) where definite 
articles are redundant, viz., their use with proper names (cf. Nocentini 1996). In 
Modern Greek, definite articles are not only used before proper names of people 
but also together with place names, with the names for planets, months, holidays 
and years, with generic and abstract terms and even in combination with adnom- 
inal demonstratives. In German and Spanish, the use of articles in combination 
with some of these names is possible but often involves a slight change of mean- 
ing. A pejorative, honorific component or an implicature of familiarity is added. 
Most of these uses are excluded in English and French.⁷ Interestingly enough, the
use of definite articles with proper names is excluded in Standard Basque, a lan- 
guage which completely excludes bare nominal phrases in argument positions 
(Etxebarria 2014). For these and other reasons, it cannot be assumed that the re- 
dundant use of definite articles is the result of a further development at the right 


7 In French and Italian, place names (towns, cities) can only combine with the definite article if 
they are followed by a relative clause (e.g., le Paris que j'avais connu il y a vingt ans ‘the Paris that
I knew twenty years ago’), unless the article is an integral part of the name (e.g., La Hague).




end of our scale. The development of totally redundant uses of definite articles 
cannot be analyzed as being part of a wide-spread chain of grammaticalization 
but must be a lateral development. 


4 Syntactic diversity in the use of definite articles 


After this brief sketch of semantic differentiations described in terms of grammat-
icalization, let us now look at some of the most striking parameters of variation in
the syntax of definite articles. In the available typological surveys (Krámský 1972; 
Lyons 1999; Dryer 2005), the following parameters of variation are invariably 
mentioned: availability of articles, one or two types, free form versus affix, inter- 
action with morphological categories and delimitation from demonstratives. 
More detailed studies on individual phenomena have additionally revealed the 
insights in Sections 4.1 to 4.4. 


4.1 Multiple use of definite articles in nominal phrases 


This is found, inter alia, in Albanian, Modern Greek, Yiddish, Romanian, Arabic, 
Scandinavian and Bavarian (cf. Plank 2003). This multiple occurrence is con- 
nected with the normal and special ordering of adjectives, as in (17). In French, 
superlatives require a double use of the definite article (e.g., l'étudiant le plus in-
telligent ‘the smartest student’).


(17) Modern Greek
a. i kondés fústes
‘the short skirts’
b. i fústes i kondés
‘the short skirts’
c. i kondés i fústes
‘the short skirts’
(Joseph and Philippaki-Warburton 1987: 51)
4.2 Co-occurrence with demonstratives, possessives or both


This is found, inter alia, in languages like the following: Italian, Modern Greek, 
Hungarian, Chamicuro (Amazonian), Polynesian, Tiri (Melanesian), Abkhaz and 
Guarani (cf. Haspelmath 1999). This double marking of definiteness, as in (18), 
seems to be connected with the time of development of the definite article. 


(18) Italian 
Ho perduto la mia giacca.
‘I’ve lost my jacket.’


4.3 Differential/extended use with respect to semantic 
context (generic, mass, deixis, proper nouns) 


This can be observed in European languages. As already pointed out, languages 
differ with respect to the extension of their use to certain contexts. From a syn- 
chronic perspective, we can rank languages according to the frequency with 
which definite articles are used, since there are more or less clear subset relations 
for restrictions on the omission of the definite article (cf. Longobardi 1994, 2001; 
Dahl 2004; Behrens 2005), which, for some languages, yields roughly the hier-
archy in (19).


(19) Greek > Basque > French, Hungarian > German > English 


4.4 Use inside of adpositional phrases 


As pointed out by Himmelmann (1998), definite articles are more exceptional in 
prepositional phrases than in noun phrases. This can clearly be demonstrated for 
languages like Romanian, Albanian, Tagalog, Bantu and Germanic and for loca- 
tive or temporal nouns in Polynesian languages. Such tendencies can also be ob- 
served in specific constructions in many European languages. Himmelmann 
(1998) offers a historical explanation for this asymmetry: definite articles develop 
relatively late and the article-less syntax of prepositional phrases is retained. 

Fine-grained comparisons between European languages clearly reveal such 
asymmetries but the differences tend to be construction-specific and no general- 
izations are possible even across genealogically related languages. In (20) to (23), 
a few examples concerning French, German, Italian and English are given. 
(20) smell: English = German = Italian ≠ French
a. This smells of cow, cat, cabbage, flowers.
b. Das riecht nach Kuh, Katze, Kohl, Blumen.
c. Ça sent la vache, le chat, le chou, les fleurs.


(21) manner of motion: English = French = Italian ≠ German
a. go by train/bus/plane/boat/on foot
b. mit dem Zug/Auto/Fahrrad/Flugzeug/Schiff reisen/fahren (versus zu Fuß gehen)
c. aller à pied/en vélo/en voiture/en avion/en bateau


(22) institutions: German = French ≠ English⁸
a. go to school/church/work/hospital/prison
b. zur Schule/Kirche/Arbeit/ins Krankenhaus/Gefängnis gehen
c. aller à l'école/à l'église/au travail/à l'hôpital/en prison


(23) musical instruments: variation within English 
a. play the piano/guitar/flute/saxophone/trombone (British English = Ital- 
ian) 


b. play piano/guitar/flute/violin (American English = German) 


In cases like the preceding ones, we enter the domain of non-referential uses of 
definite articles, which may, therefore, often be omitted. 


5 Summary and conclusion 


Definite articles made their first appearance in European languages around the 
turn of the first millennium and are thus not a category inherited from Indo-Eu-
ropean. What exactly triggered this development of demonstratives is a matter of
controversy in the relevant literature but it is quite plausible to assume that


8 Detailed corpus studies on variation in the use of the definite article across regional and tex- 
tual varieties of English can be found in Hundt (2016, 2018). Most of the variation pointed out by 
her is a matter of “more or less” rather than “either-or”.
both their semantic and syntactic properties were involved in their development
and grammaticalization. Leiss (2000) argues that it was the loss of aspectual dis-
tinctions and case inflection in Early Germanic that led to the emergence of definite
articles. Quantificational distinctions expressed in some languages by case dis- 
tinctions (partitive versus accusative) or by verbal prefixes can be transferred to 
article systems. On the other hand, definite articles can be regarded as a natural 
extension in the meaning and use of adnominal demonstratives, also found in 
other subclasses of demonstratives. As far as their syntax is concerned, articles 
are structure builders, since they occur at the periphery of a noun phrase, either 
before or, more rarely, after all other modifiers of a noun phrase. In processing 
sentences, we know that the relevant constituent is a noun phrase, as soon as we 
meet an article (Hawkins 2004: 76). 

It was the main goal of this paper to discuss the diversity in syntax, meaning 
and use of definite articles across languages, with a specific areal focus on Eu- 
rope, and, in doing so, complement the typological picture presented in Dryer 
(2014). It was shown that such comparative studies need to have clear semantic 
foundations, which can be provided by formal explications of such notions as 
uniqueness, salience, existence and exhaustiveness, traditionally known to play 
a role in the semantic analysis of definite articles. On such a compar-
ative basis, it is possible to reconstruct the historical development of definite ar-
ticles in its basic outline and to distinguish different types. In addition to mani-
festing a variety of syntactic differences across languages, definite articles were
also shown to differ strikingly in their use.
Pierre Larrivée, Adeline Patard
Pathways of evolution, contiguity and 
bridging contexts 


Abstract: Semantic maps, to which Johan van der Auwera has made a major
intellectual contribution, are a representation of implicational relations in the ty-
pological domain. They have increasingly been used to chart historical evolution.
They are arranged as a series of contiguous cells that define pathways of variation 
and change. The questions raised concern the rationale for the contiguity ar- 
rangement. It is demonstrated on the basis of novel diachronic analyses that the 
cells making up a semantic map should be semantic functions and that the con- 
tiguous arrangement of these functions relates to the existence of bridging con- 
texts. Because evolution from one function to the next is made possible by bridg- 
ing contexts, a specific pathway of function pairs defines the evolution of items 
that can only proceed between cells that share bridging contexts. 


Keywords: language change, bridging contexts, language variation, n-word, 
counterfactuality 


1 Introduction 


Johan van der Auwera has contributed significantly to the theoretical de-
bate about the conditions of typological and diachronic variation in grammar. His
proposals have helped clarify the nature of the implicational relations that cap- 
ture the extent of possible typological, synchronic and diachronic variation of 
grammatical items. In this contribution, we focus on diachronic change and pro- 
pose an answer to the major question of the cause of ordered pathways of impli- 
cational relations. Why is there a tendency for (a family of) grammatical expres- 
sions to evolve across a series of functions in an orderly way? Why should 
movement verbs, deictic expressions, perfect tenses and indefinites typically 
evolve into future markers, definite articles, past tenses and negatives respec-
tively in a number of unrelated languages? The proposal that we develop here is
that such pathways of evolution exist because contiguous functions share bridg- 
ing contexts. Bridging contexts enable change from one function to the next and 
thus explain the order of contiguous functions that shape a pathway of change. 
While the claim has been made before that bridging contexts play a role in language
change (Traugott 2012a and references therein), it has not always been substantiated
with detailed empirical mapping. This is what this paper does, by looking
at two cross-linguistically well-established cases of change in French. The
available quantitative evidence regarding the evolution of negative polarity items 
(henceforth NPIs) into negative words (n-words) and of perfect or past tenses into 
modal markers supports the role of bridging contexts as a condition of language 
change and as a determinant of the order of change from one function to the next. 


2 Pathways of evolution, contiguity of functions 
and bridging contexts 


A major type of language variation and change involves expressions that get as- 
sociated with different functions. This conjunction of functions is not random but 
tends to be realized following an ordered pathway that constrains possible con- 
figurations synchronically and diachronically. Pathways of typological variation 
have been represented through semantic maps by Haspelmath (1997). Semantic 
maps are visual representations of implication relationships designed to capture 
typological generalizations. As an illustration, let us look at the semantic map of 
indefinites in Figure 1. 


                                     Question ------ Indirect ------ Direct
                                   /    |            negation        negation
Specific --- Specific --- Irrealis      |               |
known        unknown      non-specific  |               |
                                   \    |               |
                                     Conditional --- Comparative --- Free-choice


Fig. 1: Haspelmath’s (1997: 64) semantic map of indefinites 


The map visualizes the implication relationship constrained by the contiguity be- 
tween the conjoined cells. Thus, synchronically, (families of) indefinite items that 
are found in questions can also potentially be found in indirect negation, conditional
clauses and irrealis environments. They cannot “jump” a cell and be used
with direct negation, in comparatives and as a specific unknown, for instance. 
The contiguity condition similarly constrains diachronic evolution, even though 
semantic maps were not originally designed for that purpose. An item that ex- 
presses irrealis is expected to evolve, if it does, by going successively through the 
various contiguous cells in an orderly way. 
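The contiguity condition can be stated operationally: the set of functions covered by an item must form a connected region of the map. The following Python sketch illustrates this under stated assumptions. The edge list is our own reading of the connections in Haspelmath's map (Figure 1) and should be checked against the original; the names `EDGES`, `build_graph` and `is_contiguous` are hypothetical and introduced only for illustration.

```python
from collections import deque

# Assumed adjacency of the cells in Figure 1 (our reading of the map,
# not data supplied by the chapter itself).
EDGES = [
    ("specific known", "specific unknown"),
    ("specific unknown", "irrealis non-specific"),
    ("irrealis non-specific", "question"),
    ("irrealis non-specific", "conditional"),
    ("question", "conditional"),
    ("question", "indirect negation"),
    ("indirect negation", "direct negation"),
    ("indirect negation", "comparative"),
    ("conditional", "comparative"),
    ("comparative", "free choice"),
]


def build_graph(edges):
    """Turn a list of undirected edges into an adjacency dictionary."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def is_contiguous(functions, graph):
    """Return True if the given cells form one connected region of the map."""
    functions = set(functions)
    if not functions:
        return True
    unknown = functions - set(graph)
    if unknown:
        raise ValueError(f"cells not on the map: {sorted(unknown)}")
    # Breadth-first search restricted to the cells the item covers.
    start = next(iter(functions))
    seen = {start}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        for neighbour in graph[cell] & functions:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen == functions


GRAPH = build_graph(EDGES)
# An item used in questions, indirect negation, conditionals and irrealis contexts.
print(is_contiguous({"question", "indirect negation", "conditional",
                     "irrealis non-specific"}, GRAPH))  # True
# An item that "jumps" from questions to direct negation.
print(is_contiguous({"question", "direct negation"}, GRAPH))  # False
```

On this reading, a distribution is admissible only if no cell has to be skipped to get from any of its functions to any other, which is the same restriction that is expected to hold of successive diachronic stages.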

The design of such maps contributes to the theoretical and empirical under- 
standing of variation and change by raising at least two questions, regarding the 
nature of the cells and the condition of contiguity between them. 

On the one hand, indeed, why have these particular cells been chosen? One 
suspects that selection is based on contexts that recur in grammatical descriptive 
work. Not all relevant contexts are, however, included (lexically inherent nega- 
tives, sequences commanded by before), and one is left wondering which should 
and which should not. What is more, as both van der Auwera and Van Alsenoy 
(2011a) and Larrivée (2011) regret, the content of the cells refers to different levels 
of analysis. Specific known, specific unknown and irrealis are semantic catego- 
ries; questions, comparatives, conditionals, indirect negation and direct nega- 
tion are syntactic contexts; and free-choice is both. The unfortunate result is that 
one use of an item could occur in two cells at the same time: specific unknown 
someone could well occur in a conditional (e.g., if someone calls, let me know). 
And the same context could host items with different interpretations: conditionals
can happily host specific indefinites, as we have just seen, but also negative polarity
items (such as any in if anyone is found on the grounds, security will be annoyed)
and, possibly, free-choice items (e.g., if you hang around with just anyone, you’ll
get into trouble). More generally, the interaction between item and context is not 
considered. To take a fairly obvious case, negation is a licensor for negative po- 
larity (as in the solitary attitude expressed by I do not hang around with anyone 
these days) but it can exert focus on a free-choice item (as in the exclusivity 
claimed by I do not hang around with just anyone). These considerations led van 
der Auwera and Van Alsenoy (2013: 31) to propose a more functional type of cell, 
as in Figure 2. 
non-specific direct negation

non-specific negative polarity

non-specific free choice


Fig. 2: van der Auwera and Van Alsenoy’s (2013: 31) semantic map of any 


Since both contexts and functions such as specific indefinite, negative polarity 
and n-word characterize the behavior of (a morphological family of indefinite) 
items, Larrivée (2011) suggests that it may be wise to map both. The functions 
would chart the higher-level map, complemented by a set of contexts subordinate 
to each. While the details of the map design need to be worked out, the idea from 
both proposals is that a map designed from functions might account more ro- 
bustly for typological, diachronic and synchronic variation. 

On the other hand, the contiguity condition deriving from implicational rela- 
tionships appears as a desirable constraint on pathways of synchronic and dia- 
chronic variation. Haspelmath (1997) points out that this reveals a variation that 
is more limited than one might expect from a purely Saussurean arbitrary associ- 
ation between a phonological form and a particular meaning. However, particu- 
lar arrangements might make unsupported predictions. Van der Auwera and Van 
Alsenoy (2011b: 335) remark that Dutch niemand ‘no one’ can be used in questions 
and direct negation but not with indirect negation, in contradiction to the conti- 
guity principle. It may thus be that, while the contiguity principle is correct, the 
particular arrangement of cells may not be. Basing the map primarily on func- 
tions rather than contexts may help distinguish between language-specific be- 
havior and general patterns of evolution. However, the question remains as to 
why contiguity of functions should constrain the evolution of items on a pathway 
of change. To take a concrete illustration, why do n-words such as nothing tend 
to emerge from negative polarity items such as anything, which themselves arise 
from specific indefinites such as someone? Why should items not jump a step and 
go directly from specific indefinites to n-words, for instance? Larrivée (2003) tentatively
proposes that contiguous functions share the largest number of semantic
traits between them but concedes that this is a potentially circular proposal in 
need of empirical support. The proposal that we put forward and substantiate in 
this paper is that the cause of the contiguity condition between functions on a 
pathway of change is the existence of a bridging context that relates them. It is 
because there is a bridging context between NPIs and n-words that they are reg- 
ularly related through variation and change. Such relations are not normally 
found between specific indefinites and n-words since there are no bridging con- 
texts between them. This proposal maintains a ban on jumping over cells which 
constrains variation in a desirable way while explaining why things are ordered 
and why they are ordered in the way they are. 

The notion of bridging context calls for some clarification. It can be conceived
of as an environment that is compatible with two interpretations of a given
expression. Thus, expressions occurring in a bridging context provide input for 
new generations of speakers to reanalyze them and, as such, bridging contexts 
are often considered as the condition of language change. Whether the change is 
necessarily effected by children as the new generation of speakers is a point of 
some debate (see Diessel 2002 and references therein). For one thing, children do 
not have the sociolinguistic prestige to force actuation of change. Moreover, it is 
well-known that language changes during the life of speakers (Diessel 2002). But 
whether it is children or adults who effect the reanalysis, it seems plausible that 
language-internal ambiguity plays an important role in grammatical change. The 
different steps of ambiguity-led change have been discussed by Heine (2002), 
Diewald (2002) and Traugott (2012a, 2012b). We gloss over the details of the vari- 
ous models and of the putative relations between them to summarize the general 
points of agreement. It is generally assumed that an item can acquire a new meaning
because it occurs in a context that allows both its conventional meaning and
the new target meaning. Such contexts are known as bridging contexts because 
they act as bridges between source and target. A bridging context must be distin- 
guished from a so-called switch context, which is compatible with the target 
meaning only, exclusive of the source interpretation. An illustration would be the 
interpretation of the present perfect in I have bought the car last year, where the 
temporal phrase makes it clear that a past reference (the target meaning) is at 
stake rather than a resultative one (the source meaning) which would require a 
present reference. In many cases, it is difficult to know whether the target mean- 
ing is a conventional property of the item or just a potential reading of it. The final 
stage of the change is one in which the target meaning is fully conventionalized. 
A debate concerns the stage at which the target meaning becomes foregrounded
(Heine 2002; Traugott 2012a). 

It is striking that, despite the plausible nature of this model and its popularity
in typological and diachronic research, little empirical support has been adduced 
to buttress it, although works by Diewald and a few others come to mind (e.g., 
Diewald 1999 on German modal verbs; Diewald and Ferraresi 2008; Giacalone
Ramat and Mauri 2009 on the temporal and adversative adverb tuttavia). What 
would prove that a context promotes change from a source to a target meaning? 
An experimental protocol is proposed by Cournane (2014), which is very interest- 
ing but clearly quite elaborate, and it is not clear whether it can be replicated for 
all changes. A replicable approach is one that examines corpus data with a quan- 
titative method. Traugott (2012b) provides a detailed analysis of future markers 
developing from the movement expression be going to. The compatibility of both 
readings with a majority of contexts of use for a long period makes the precise 
bridging context difficult to pin down. This raises the question whether there is 
always a bridging context for change and whether all changes involve ambiguity 
between the source and target interpretation. Diewald (1999) claims that opacity 
of the source meaning is sufficient for change to take place. Traugott and 
Trousdale (2014) point out that no obvious ambiguous context comes to mind as 
a candidate for bridging changes in information structure value. The question of
whether all changes involve ambiguity arising from one precise bridging context 
could be answered confidently once a sufficient body of quantified empirical re- 
sults has been brought together to characterize changes where ambiguous con- 
texts play a definite role. 

In this section, we have proposed that regular language change finds its 
cause in the occurrence of an expression in an ambiguous context that allows it 
to be reanalyzed with a new function. The grammatical changes thus mapped 
should correspond to the contiguous cells of the relevant semantic map, espe- 
cially if the contents of cells relate to functions. Empirical evidence with respect 
to the role of bridging contexts in grammatical change is provided by the two fol- 
lowing sections. 


3 From NPIs to n-words? 


In the previous section, we have suggested that the reason why grammatical 
items evolve from one function to the next in an orderly way is that these func- 
tions are related by a bridging context. However, the demonstration that such 
bridging contexts play a causal role in language change, in the cases where they 
can reasonably be expected to play such a role, remains, on the whole, to be provided.
This is what this section contributes to by examining critical aspects of the
evolution of n-words in medieval French. 

There is substantial empirical evidence in a well-documented language like 
French that items that became n-words like no one were NPIs like anyone in pre- 
vious historical periods. This evolution is documented in a number of traditional 
and recent studies (e.g., Martineau and Déprez 2004; Prévost and Schnedecker 
2004; Vanderheyden 2010; Ingham 2011; Kallel and Ingham 2014; Labelle and 
Espinal 2014). Martineau and Déprez (2004) provide quantitative data showing 
that the positive indefinite use of rien, as in (1), and aucun, disappears during the 
17th century, that the NPI use, as in (2), varies between 10% and 30% in the me- 
dieval period and that a majority of negative uses, as in (3), is found from the 17th 
century onward. 


(1) Quant la rien que ge plus amoie / Voi morte, vie que me vait? 
‘When the thing that I loved most I see dead what is life worth to me?’ 
(Foulet 1970: 272) 


(2) Honnis soit ki rien lour donra. 
‘May he be cast out he who will give them anything.’
(Foulet 1970: 275) 


(3) Il (ne) leur donne rien. 
‘He gives them nothing.’ 


However, the chronology seems conservative in that, for instance, NPI uses are 
uncommon in contemporary vernacular French. This may be related to the fact 
that the data used come from the literary corpus Frantext. Moreover, there is no 
breakdown of figures per main contexts of use. Based on administrative material 
presumably closer to the vernacular practice than literary sources, Ingham (2011) 
and Kallel and Ingham (2014) provide partial quantitative information on some 
contexts of use of future n-words, more specifically, in conditional clauses. How- 
ever, this context is chosen because it epitomizes the NPI function of items and is 
therefore unlikely to be the critical context that makes the change from NPI to n- 
word possible. 

This highlights the issue of which contexts deserve particular attention
in the study of change from NPI to n-word. Which context can be considered
as a bridging context between NPI and n-word functions? Remember that a
bridging context is one which is compatible with both functions of an item, in 
which therefore reanalysis can occur. In other words, a bridging context from NPI
to n-word should be compatible with both interpretations of an item. Haspelmath 
(1997: 154) and Breitbarth (2014: 60) propose that comparatives could be the con- 
text in question. If so, one would expect comparatives to allow both for an NPI 
and n-word reading to occur with target expressions. However, the expected am- 
biguity does not seem to arise. Consider the sequences in (4) to (7) with English 
NPI and n-words in the equality and superiority comparatives. 


(4) He's as good as anyone. 
(5) He's better than anyone. 
(6) ?? He's as good as no one. 
(7) ?? He's better than no one. 


The NPI reading is available in converging contexts (4) and (5), which contain an 
NPI. However, with the English n-word in (6) and (7), it is not clear whether an 
NPI reading is possible: if acceptable, these sentences do not assert that the per- 
son in question is better than anyone; they rather deny that they are better at all. 
So English shows a complementary distribution, where the NPI has an NPI read- 
ing and the n-word an n-word reading. The expected ambiguity is not provided 
by comparative contexts. 

There is one context that does provide the expected ambiguity. It is that of a 
strong NPI, under the direct scope of without and sentential negation.¹ Consider
(8) and (9). 


(8) He was left without anyone to talk to. 


(9) He was left without no one to talk to. 


1 Following standard formal definitions, weak NPIs depend on a downward-entailing
environment, where there is an entailment from superset to subset (I didn't eat a single
vegetable entails I didn't eat kale), whereas strong NPIs depend on the more stringently
defined anti-additive property, such that a negation of conjoined phrases implies the
conjunction of two negated sequences (no kale or spinach was sold entails no kale was
sold and no spinach was sold). For more details on this, see, for instance, Krifka (1994).
The NPI reading seems available in both sequences. Although the without no one
sequence might be frowned upon by prescriptivists, it seems prevalent enough 
and the attestation in (10) raises no comprehension problem. 


(10) Have you ever ridden a bus or the train feeling bored without no one to talk to 
and wishing that you were listening to something good? 
(http://www.sampleessaytopics.info/trying-to-sell-an-ipod-essay) 


Thus, both NPIs and n-words can occur in strong polarity environments with an 
NPI reading. How does that fact show that an NPI can be reanalyzed as an n- 
word? It does so because, in that context, an NPI is indistinguishable from an n- 
word. An item that occurs in that context can therefore be reanalyzed as an n- 
word by some speakers. In (11), the item aucun, under the command of ‘without’ 
and clausal negation, can be analyzed either as a commanded NPI or as a con- 
cording n-word. 


(11) a. ledict suppliant a esté et est de bonne vye et renommée et s'est honnora- 
blement gouverné, sans jamays avoir esté attainct ne convaincu d'aucun 
villain cas 
‘the said supplicant is and has been of good repute and has behaved
well, without ever having been convicted of any/no judicial case before’

b. dudit suppliant qui n'avoit jamais eu aucune disputte ne querelle avecq 
ledit deffunct 
‘the said supplicant who never has had any/no dispute with the said
dead man’


What is crucial is the inability, in that context, to distinguish whether one is deal- 
ing with an NPI or an n-word, making the reanalysis possible. 

The question of which context bridges the two readings remains an empirical 
matter, however. It would be necessary to examine cases of evolution from NPI 
to n-word to establish whether strong NPI environments indeed play a particular 
role at the period of change. The expectation is that, if strong NPI contexts allow 
NPIs to be reinterpreted as n-words, they should be preponderant right before 
and during the period of change. How preponderance is to be understood is still 
to be established. This gap is what Kallel and Larrivée (2016) intend to fill.
This study aims to determine what the bridging context is in the evolution from 
NPI to n-word functions, whether it corresponds to a strong NPI context and 
whether this context can be demonstrated to play a preponderant role at the time 
of change. One way to establish such preponderance would be to determine that 
there is a significantly higher rate of use of the relevant items in strong NPI contexts
before they acquire a majority of n-word uses. In order to answer these questions,
the study examines the evolution of contemporary French n-words. French
is a well-documented language, with continuous prose material for a variety of 
genres since the end of the 13th century. This variety of genres allows us to focus on
material other than literary texts. The reason to do so is the stylistic dimension of 
literary genres that tends to make the language used more conservative and fur- 
ther removed from the immediate competence of speakers. The evolution of a new 
series of n-words is further known to be taking place in medieval French. One 
such evolution is that of aucun ‘no(ne)’, which goes from a specific indefinite 
equivalent to ‘some’ to an NPI equivalent to ‘any’ and, in a final stage, to an n- 
word equivalent to ‘none’ by the 17th century. Other n-words have already com- 
pleted a similar evolution, like rien ‘nothing’ by 1300 or are yet to complete it, like 
personne ‘nobody’, which starts having n-word uses by 1700. In our study, the 
behavior of aucun was examined in a set of strictly comparable judiciary texts 
called remission letters. These are letters in which a culprit describes in narrative 
format the crime he or she has committed and asks for royal pardon against a 
financial payment. The document is drafted at the local level and received, final- 
ized and archived by the Royal Chancery. This means that the language broadly 
reflects that of a single geographic region, i.e., Paris. Such letters were produced 
from the mid-14th century to the 18th century, are plentiful, dated and localized 
and tend to be edited without the extensive interventions that literary texts may 
have been subjected to, including normalization of orthography and insertion of 
punctuation. Groups of remission letters separated by about 50 years have there- 
fore been looked at. The occurrences of aucun were extracted and each catego- 
rized as to whether it represented a specific indefinite, an NPI in weak or strong 
context or a pre-verbal or post-verbal n-word. Furthermore, in order to exclude 
formulaic sequences that might not reflect the immediate competence of speak- 
ers, strings including aucun that were likely to represent fixed phrases were ex- 
cluded. The results are presented in Table 1. 
Tab. 1: Evolution of functions and contexts for productive uses of aucun in remission letters


             Neg                        NPI               Specific      Totals
             Pre-verbal   Post-verbal   Strong   Weak     indefinite
1357-1360    0            2             12       26       11            51
1422-1435    3            24            18       6        57            102
1487         2            61            49       11       126           241
1531-1532    2            35            56       3        70            169
1580-1600    0            41            91       24       16            172
1688-1783    1            9             5        1        0             16


The important pieces of information are the emergence of the preponderant n- 
word function and the proportion of uses in strong NPI contexts. It is not before 
the latest period that aucun most frequently functions as an n-word. That period 
is represented by only a small number of occurrences: the corpus of later remis- 
sion letters is smaller due to the destruction of many sources at the time of the 
Revolution. Nonetheless, this is only a partial concern since it is independently 
known that aucun had gained n-word status by the 17th century. The crucial ob- 
servation supported by a sufficient number of occurrences is that of the prepon- 
derant function at a given period. The strong NPI context represents over 50% of 
the 172 cases of aucun in the late 16th-century period. The use in the bridging 
context at preponderant rates obtains in the period that immediately precedes 
the period in which aucun has a majority of n-word uses. 
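The proportions discussed here can be recomputed from Table 1. The Python sketch below is illustrative only: the names `TABLE_1` and `shares` are hypothetical, and the reported totals are used as denominators even where they differ slightly from the sum of the cells as they appear in the table.

```python
# Rows of Table 1 as (pre-verbal n-word, post-verbal n-word,
# strong NPI, weak NPI, specific indefinite, reported total).
TABLE_1 = {
    "1357-1360": (0, 2, 12, 26, 11, 51),
    "1422-1435": (3, 24, 18, 6, 57, 102),
    "1487": (2, 61, 49, 11, 126, 241),
    "1531-1532": (2, 35, 56, 3, 70, 169),
    "1580-1600": (0, 41, 91, 24, 16, 172),
    "1688-1783": (1, 9, 5, 1, 0, 16),
}


def shares(row):
    """Return the share of n-word uses and of strong NPI uses in one period."""
    pre, post, strong, _weak, _specific, total = row
    return {
        "n_word": (pre + post) / total,
        "strong_npi": strong / total,
    }


for period, row in TABLE_1.items():
    s = shares(row)
    print(f"{period}: n-word {s['n_word']:.1%}, strong NPI {s['strong_npi']:.1%}")
```

On these figures, strong NPI contexts exceed half of the occurrences only in 1580-1600 (91 of 172), and n-word uses exceed half only in 1688-1783 (10 of 16), which is the sequence described in the text.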

These results are important in that they support the bridging context scenario
of language change. They support the view that the strong NPI context plays the
role of bridging context between the two contiguous functions of NPI and n-word.
The reported investigation shows that the bridging context represents
over 50% of occurrences in the period immediately before the change. This
suggests that the hypothesized preponderance of bridging contexts
immediately before the change does obtain and may be represented by a threshold
of half the occurrences.

In this section, we have reported on one recent study (Kallel and Larrivée 
2016) that provides empirical support for the bridging context scenario of change. 
The new evidence shows that the use in a bridging context is preponderant just 
before a change occurs. Of course, more data is needed to confirm this. If con- 
firmed, the results help resolve a mystery about grammatical variation, as under- 
lined by semantic maps, which is why evolution occurs along an orderly pathway 
of functions. Change occurs from one function to the next without a jump because
pairs of functions are related to one another by specific bridging contexts that do 
not relate non-adjacent functions. Whether bridging contexts can be demon- 
strated for other types of change is pursued in the next section. 


4 From perfect/past to counterfactuality 


Another well-known case of semantic variation exhibited by grammatical items 
is that of perfect/past markers which may also express modality. Indeed, in many 
languages, it is the same markers that give a perfect viewpoint on a situation or 
refer to the past and, in parallel, convey a modal attitude toward the speech con- 
tent or toward the hearer (hypothesis, counterfactuality, politeness, suggestion 
and so on). Examples (12) to (15) are a few illustrations from English. 


(12) If I won the lottery, I would travel the world. 
(13) I wish he had invited me to his birthday. 
(14) I wanted to ask you a favor. 

(15) It's high time we came back home. 


This functional variation is well-documented cross-linguistically (e.g., James 
1982; Fleischman 1989; Thieroff 1999; Iatridou 2000; Van linden and Verstraete 
2008) but barely diagrammed by means of semantic maps (but see Patard 2014 
for the modal uses of preterits and imperfects). We would like to argue that the 
two types of functions are not randomly conjoined across languages but that they 
are historically connected by bridging contexts allowing for semantic reanalysis. 
The existence of bridging contexts also attests to the fact that the connection between
the two functions follows an orderly pathway from tense/aspect to modality
and not the other way round (see also Patard 2014). In this section, we focus
on one specific type of such functional variation: perfect or past markers meaning 
counterfactuality. Bridging contexts between these semantic functions will be ev- 
idenced by historical data concerning one verbal tense from French: the so-called 
conditionnel passé. 
The conditionnel passé is morphologically the perfect version of the conditional
tense, which is etymologically an alethic² periphrasis conjugated in the imperfect
tense (Benveniste 1974; Bourova 2005). Originally, the conditional tense
— and therefore its perfect version, the conditionnel passé too — expresses two dis- 
tinct functions (Patard 2017): a future-of-the-past function (in indirect speech) — 
the situation lies in the future of a past moment (generally expressed by a verbum 
dicendi) — and a hypothetical function — the speaker does not specify whether the 
situation is true or false (‘neither p nor non-p’). Hence, the conditionnel passé may 
express the anteriority of a future-in-the-past situation, as in (16), or of a hypo- 
thetical situation, as in (17). 


(16) Je me dis ... qu'elle ne pourrait plus rien exiger quand je l'aurais replacée au 
milieu de sa famille. 


‘I told myself ... that she would not demand anything when I had put her 
within her family.' 


(B. Constant, 19th century) 


(17) Si cette clef ne quittait jamais Mlle Stangerson, l'assassin aurait donc attendu
Mlle Stangerson cette nuit-là, dans sa chambre, pour lui voler cette clef. 


‘If this key never left Miss Stangerson, then the murderer would have waited 
for Miss Stangerson this very night, in her room, to steal the key.’ 


(G. Leroux, 20th century) 


However, in Modern French, the conditionnel passé further conveys counterfac- 
tuality: the denoted situation is contrary-to-facts; it cannot be helped anymore 
(‘non-p’). This reading, which has become by far the most frequent in Modern 
French, is obtained, for instance, in if-sentences, as in (18), or with a modal verb, 
like deontic devoir in (19). 


(18) Si je m'étais obstiné à aller à Rome, j'aurais perdu Milan. 
‘If I (had?) persisted to go to Rome, I would have lost Milan.’ 
(B. Napoléon, 19th century) 


2 According to Bourova and Tasmowski (2007: 28-29), the Latin periphrasis INF + habere con- 
veys alethic necessity lato sensu, i.e., a logical necessity: the situation is necessarily the case in 
whatever possible world. Its meaning is close to that of the English construction be to + INF. 


198 — Pierre Larrivée, Adeline Patard 


(19) J'aurais dû fuir, je n'osais pas.
‘I should have run away, (but) I did not dare to.’ 


(G. Bernanos, 20th century) 


A closer examination of the synchronic data confirms what van der Auwera and 
Van Alsenoy (2011b, 2013) and Larrivée (2011) recommend for the mapping of 
grammatical variation, i.e., that it should primarily rely on functional variation 
and not on syntactic contexts. Notably, we may observe that the two epistemic 
interpretations of the conditionnel passé (hypotheticality and counterfactuality) 
share common syntactic contexts: they may both occur in conditional sentences 
but also in combination with modal verbs. Indeed, the hypothetical reading is 
also possible with a modal verb, like epistemic pouvoir in (20), in contexts where 
the speaker does not know the reality status of the situation.


(20) Cet œil n'appartenait pas non plus à Alain Kernoul, comme Hervé aurait pu
le croire. 


‘This eye did not belong to Alain Kernoul either, as Hervé might have
thought.’


(F. Du Boisgobey, 19th century) 


These facts suggest that syntactic contexts are not always relevant to map gram- 
matical variation. If we were to draw the semantic map of the conditionnel passé, 
the distinction between the use in conditional sentences, the use with modal 
verbs and the use in other contexts would not be discriminative. The map would 
rather rely on semantic functions: future-of-the-past, hypotheticality and coun- 
terfactuality. 

The evolution of the conditionnel passé further raises the question of dia- 
chronic variation: how can a perfect (conditional) tense turn into a counterfactual 
marker? Was this evolution conditioned by bridging contexts? A quantitative 
study of a diachronic corpus (Patard, Grabar and De Mulder 2015) allows us to 
trace the semantic evolution of the conditionnel passé, as in Figure 3.³


3 The corpus covers the period from the 11th century to the 20th century and is composed of 152 
texts (literary for the most part) distributed into ten sub-corpora representing each century. The 
whole corpus numbers almost 9.8 million words (see Patard, Grabar and De Mulder 2015 for more 
details). 
Fig. 3: Evolution of the semantic functions of the conditionnel passé (normalized frequency per
million words) 


The data show that the counterfactual reading was possible in quite an early pe- 
riod, from the 13th century, only one century after the first attestations of the con- 
ditionnel passé. At that time, the counterfactual reading was extremely rare, 
though, the main interpretations being those of anterior future-of-the-past and 
anterior hypothetical marker — see (16) and (17). The counterfactual interpreta- 
tion only expanded from the 17th century, yielding a sweeping rise in frequency 
(from around ten occurrences per 100,000 words to more than 75 occurrences per 
100,000 words after the 17th century). 

We may consider that bridging contexts have existed since the 13th century, when the
hypothetical conditionnel passé starts licensing a counterfactual interpretation
(even if this is sporadic until the 17th century, as mentioned before).⁴ In these
bridging contexts, the conditionnel passé is a hypothetical marker - it says noth- 
ing about the reality status of the hypothesized situation (‘neither p nor non-p’) 
— which gives a perfect viewpoint that usually gives rise to an anteriority inter- 
pretation — the situation is anterior to another situation given in the context. 
Counterfactuality may then be expressed by contextual items which specify the 
epistemic status of the hypothetical situation as being contrary-to-facts (‘non-p’). 
Thus, in earlier texts, we typically find the use of pluperfect subjunctives in the 
protasis, such as eust été in (21). 


4 The future-of-the-past conditionnel passé found in indirect speech contexts is not concerned 
by the semantic change. 
(21) Si lui dirent li pluiseurs que, SE il eust esté au pais, que cil de l'Escorta
n’auroient fait ce que il firent.


‘Thus, most of them told him that IF he had been in the country, men from 
Escorta would not have done what they did.’ 


(Anonymous, Chroniques de la Morée, 14th century) 


But, crucially, counterfactuality is the result of a reinterpretation of the conditionnel
passé that is induced by the context. This reading is brought about by two successive
inferences: (i) a past entailment and (ii) a counterfactual implicature.

First, the perfect aspect of the conditionnel passé entails past when it refers
to situations that are anterior to other past situations described in the context.
For instance, in (21), the state of affairs described by the conditionnel passé
(‘would not have done’) is anterior to the past situation denoted by dirent ‘told’.
The perfect aspect of the conditionnel passé consequently entails that the
hypothesized situation is past. The past entailment conventionalizes from the 17th
century and overshadows the perfect aspect in certain contexts that one may associate
with Heine's (2002) switch contexts, thus following the evolution pathway of perfect
forms suggested by Bybee, Perkins and Pagliuca (1994: 105): resultative > anterior
> past.

Then, the past interpretation joined together with the hypothetical meaning 
implicates counterfactuality due to our experience and conception of time. Time 
is perceived and conceived (at least in Western cultures) as asymmetrical: past is 
the domain of the irrevocable and the known, as opposed to the future, which is 
the domain of the possible and the unknown. As a consequence, when a speaker 
talks about a hypothetical situation that is past, he suggests, by default, that the 
situation was not the case, because we usually know what happened. In short, 
making a hypothesis about the past implicates counterfactuality. That is how the 
conditionnel passé allows for the interpretation of the counterfactual meaning 
that will conventionalize in the 17th century. 

To sum up, the bridging contexts of the conditionnel passé are contexts where 
(i) the denoted hypothetical situation is anterior to a past situation, as in (21), and 
(ii) the speaker knows what happened in the past (this is the case by default). 
Thus, in the case of the conditionnel passé, critical contexts do not correspond to 
a specific morphosyntactic environment but to a functional type of context, in 
which anteriority to the past is expressed in a context of known past. This seems 
to give further confirmation to the fact that a functional mapping of grammatical 
variation and evolution may be more appropriate. Moreover, it is interesting to 
note that, a century before the increase of counterfactual contexts, namely in the 
16th century, the contexts where the conditionnel passé serves to denote past
situations have also increased by comparison with the previous period. This is part
of the general aorist drift that affects French perfects from Old French onwards, following
Bybee, Perkins and Pagliuca's (1994) evolution pathway: resultative > anterior >
past. This trend is illustrated in Figure 4, adapted from Patard, Grabar and De
Mulder (2015).
[Figure 4: chart with two series, Anteriority and Past, plotted by century from the 11th to the 20th]


Fig. 4: Evolution of anterior and past interpretations (percentages)⁵


In other words, the conditionnel passé increasingly tends to entail past from the 16th
century, thus paving the way for the sweeping rise of counterfactual contexts in the
following century. Notice, however, that these past contexts do not necessarily
correspond to bridging contexts (far from it), since the counterfactual implicature
remains very marginal at that time (compare with Figure 3). Indeed, most
cases correspond to contexts of ignorance, where the speaker does not know
what happened in the past. Hence, the epistemic status of the past hypothetical
situation remains indeterminate (‘neither p nor non-p’). This is the case, for
instance, in interrogative contexts.⁶ So bridging contexts are not only contexts
inducing past entailment but also contexts of epistemic ignorance.


5 Anteriority was coded for cases where the denoted situation is anterior to another situation 
described in the same sentence. Past is coded for cases where the situation precedes the time of 
utterance. As a consequence, the same token could be coded for both anteriority and past (see 
Patard, Grabar and De Mulder 2015 for more details). 

6 An example is ce maulvais vent, qui court, t'auroit il bien poulsé hors de la Court? (Marot, L'Ad- 
olescence Clémentine, 16th century) 'this ill wind, which blows, would it have indeed pushed you 
outside the Court?’. 
One may finally point out that the bridging contexts we have just described
do not exactly follow the mainstream definition according to which both the 
source meaning and the target meaning are available. In the case of the condi- 
tionnel passé, bridging contexts are not ambiguous because the hypothetical 
meaning and the counterfactual meaning cannot be interpreted at the same time: 
either the reality status of the situation is unspecified (‘neither p nor non-p’) or it 
is specified as counterfactual (‘non-p’) but it cannot be both. This suggests that 
bridging contexts have more to do with the interpretation of pragmatic inferences 
than with semantic ambiguity. 

To conclude, the inference of counterfactuality in bridging contexts implies 
that the semantic evolution of the conditionnel passé follows a pathway from per- 
fect to past and, ultimately, to counterfactuality. This shows that the conjunction 
of these functions across languages is not random but that they are historically 
connected by the conventionalization of pragmatic inferences. The evidence for
bridging contexts further implies that the connections between these functions
are strictly oriented, from aspect to tense and to modality (and not the other way
round).

Furthermore, we may note that the inference of new meaning in bridging 
contexts is not a sufficient condition for semantic change. It is only during the 
following stage of switch contexts that the new grammatical interpretation ex- 
pands and conventionalizes as an effect of its increased frequency. In the case of 
the conditionnel passé, the multiplication of contexts allowing for the counterfac- 
tual inference is clearly caused by systemic changes occurring in the same period, 
namely the decline of the competing subjunctive forms (imperfect and pluperfect 
subjunctives), the generalization of the non-perfect conditional tense in hypo- 
theticals and the development of the aorist interpretation of perfect forms (see 
Patard, Grabar and De Mulder 2015 for a detailed analysis). Bridging contexts 
thus appear to be a necessary condition rather than a causal factor. 


5 Conclusion 


In this paper, we have revisited the issue of language-internal grammatical 
change in connection with semantic maps, by examining two diachronic evolu- 
tions in French: that of NPIs into n-words and that of the conditionnel passé from 
hypothetical perfect to counterfactual past. 

Semantic maps define pathways of evolution by a series of contiguous cells. 
These cells, we have insisted, represent functions rather than contexts. Moves 
between contiguous functions thus shape diachronic evolution. This raises the 
question why such a contiguity condition should constrain grammatical change
in the way it does. We have brought together evidence to support the claim that 
the contiguity condition is a consequence of the way change occurs. The modality 
of change is through use in contexts that, at least in some cases, allow ambiguity 
between two functions. A bridging context allows for items with an established 
function to acquire a new function. This explains why evolution is mapped by an 
orderly pattern of functions that are related by bridging contexts. There are no 
jumps across functions because there are no bridging contexts between non-con- 
tiguous functions. Thus, this paper has empirically substantiated the idea that 
evolution is constrained by bridging contexts. It has also highlighted the role of 
frequency in grammatical change by showing that, in both cases, the proportion 
of potentially critical contexts — i.e., strong negative polarity contexts and past 
contexts — has increased during the period immediately preceding the functional 
change. 

With a view to future work, we note that the identity and role of bridging con- 
texts will vary according to the evolution under consideration. It should be clear 
from our discussion that, while the evolution of NPIs and the conditionnel passé 
relate to bridging contexts, they do so in a slightly different way. The status of an 
item as NPI or n-word is ambiguous in bridging contexts (strong negative polarity 
environments) because an NPI is indistinguishable from an n-word in such con- 
texts. By contrast, the reading of the conditionnel passé in bridging contexts (con- 
texts with anteriority to past situations and epistemic ignorance) is not ambigu- 
ous but unequivocally counterfactual, the new meaning being crucially inferred 
from contextual information. In other words, semantic ambiguity seems crucial
in one case while bridging contexts rely more on pragmatic inferences in the 
other. This raises the issue of the nature and definition of bridging contexts, 
which seem to vary as to the way they allow for reanalysis - an issue that remains 
to be explored in future research. 
Jacques Moeschler
On the pragmatics of logical connectives 


Are connectives truth-functional? 


Abstract: This paper discusses the issue of connectives in natural language,
adopting a formalist approach in pragmatics. The outcome is that truth-conditional
connectives are limited to conjunction, disjunction and conditional, and
that negation, even in its metalinguistic and non-truth-conditional usages, has
representational contextual effects, such as suppressing a proposition and a
presupposition or strengthening a proposition. As regards discourse connectives, they
exhibit a strong pragmatic property, that is, they almost all exhibit factivity.
Finally, quasi-synonymous connectives, such as causal ones, do not differ in meaning but
in the way their conceptual and procedural meanings are distributed at different
layers.


Keywords: conjunction, disjunction, conditional, negation, causal connectives, 
concessive connectives 


1 Introduction 


A small subset of connectives in natural language are clear counterparts of logical
connectives. As we shall see, this is less a question of identity
of meaning than a question of lexical entry, since the logical conjunction,
disjunction and conditional as well as the operator of negation are linguistically
translated in English as and, or, if and not.

Even if the relation between logical and linguistic meaning has given rise to 
a huge literature in semantics and pragmatics (e.g., Allwood, Andersson and 
Dahl 1977; Gazdar 1979; Mauri 2008; Humberstone 2011; Mauri and van der Au- 
wera 2012), it is not completely clear whether linguistic uses of logical connec- 
tives should be connected to their logical meaning, that is, to truth-conditional 
meaning. 
In a nutshell, there are three principal positions about the relation between
logical and linguistic meanings of logical connectives: a formalist account, rep- 
resented by Gazdar (1979), a non-formalist one, represented by Ducrot (1989), 
and a Gricean perspective, mainly represented by neo-Gricean pragmatics (Horn 
1972, 1984, 1989; Levinson 2000) and post-Gricean pragmatics (Blakemore 1987; 
Sperber and Wilson 1995; Carston 2002). 

In this contribution, I will mainly advocate the formalist approach, even
though this position seems to be the first to be abandoned because of its limited
empirical coverage. Though I have recently defended the Gricean perspective
(Moeschler 2010, 2017a), I would like to formulate a very counterintuitive expla- 
nation. The main argument I will develop is that the formalist view is the only one 
which seems to give rise to some conclusions about semantic universals and gen- 
eral pragmatic principles. I will also point to some drawbacks of the neo- and 
post-Gricean explanations and will address a more general issue which is con- 
nected to the semantics-pragmatics interface (Moeschler forthc.). 

In this paper, I will not deal with the non-formalist approaches, even though 
they address interesting questions from a pragmatic point of view (see Moeschler 
2017a for a general comment on these approaches). 


2 The classic Gricean approach 


Let us start with the classic Gricean account of logical connectives. In the first 
pages of Logic and conversation, Grice (1975: 41, 43) addresses the issue of logical 
connectives as an introduction to his theory of meaning, claiming that the stand- 
ard approaches in logic and philosophy of language - the formalist and the non- 
formalist ones — do not address accurately the relation between logical devices 
and natural languages: 


It is a commonplace of philosophical logic that there are, or appear to be, divergences in 
meaning between, on the one hand, at least some of what I shall call the FORMAL devices 
- ~, ∧, ∨, ⊃, (x), ∃(x), ιx (when these are given a standard two-valued interpretation) - and,
on the other, what are taken to be their analogues or counterparts in natural language — 
such expressions as not, and, or, if, all, some (or at least one), the. ... I wish, rather, to main- 
tain that the common assumption of the contestants that the divergences do in fact exist is 
(broadly speaking) a common mistake, and that the mistake arises from an inadequate at- 
tention to the nature and importance of the conditions governing conversation. 


In a nutshell, the Gricean story works like this: logical connectives have as
semantics their logical truth-conditional meaning while their pragmatic uses are
derived by implicature. For instance, the temporal meaning of the conjunction
and results from observing the maxim of order (“be orderly”), the exclusive
meaning of a disjunction (or) results from observing the first maxim of quantity
(“make your contribution as informative as required (for the current purposes
of the exchange)”), and the biconditional reading of if (conditional perfection) is
a by-product of the second maxim of quantity (“do not make your contribution
more informative than is required”), the maxim of relevance (“be relevant”) and
the submaxim of brevity (“be brief”). Examples (1) to (3) are illustrations of these
uses.


(1) He took off his trousers and got into bed.
Implicature: ‘He took off his trousers and then got into bed.’


(2) On a menu: Cheese or dessert.
Implicature: ‘Not cheese and dessert.’


(3) If you mow the lawn, I owe you 10 euros.
Implicature: ‘If you don’t mow the lawn, I don’t owe you 10 euros.’


Let us examine these three connectives. 


2.1 And-pragmatic meaning: implicature or explicature? 


The implicature type of solution meets with a lot of issues, however, the most
serious having been addressed by Cohen (1971), Carston (2002) and Wilson and
Sperber (2012) (see Blochowiak, Castelain and Moeschler 2015 for an experi- 
mental testing of this problem). First, as regards and-implicature, the Gricean ex- 
planation says nothing about when and is interpreted as ‘and as a consequence’. 
In other words, the causal interpretation of and cannot be explained by any con- 
versational maxim. So, in this case, (4) should be restricted to the meaning ‘and 
then’, and not ‘and as a consequence’: 


(4) He turned the key and the engine started. 
a. ‘He turned the key and then the engine started.’ 
b. ‘He turned the key and as a consequence the engine started.’ 


This has led Levinson (2000: 37, 114-115) to propose a different stance, in which 
the hearer is invited to enrich and-interpretation as far as he can, to obtain the 
most informative interpretation, via an I-implicature. In that case, the interpretation
depends on the I-Principle and the I-heuristics:


I-Heuristics 
What is expressed simply is stereotypically exemplified. 


I-Principle 

Speaker’s maxim: the maxim of Minimization; “Say as little as necessary”. ... 

Recipient’s corollary: The Enrichment Rule. Amplify the informational content of the 
speaker’s utterance, by finding the most specific interpretation. 


So, the I-heuristic invites the addressee to go up to the temporal, causal and con- 
sequence interpretation, the latter being the most informative one. 

However, this explanation has given rise to some crucial issues, mainly be- 
cause the non-truth-conditional pragmatic interpretation for and cannot explain 
why (5) is not a tautology. In effect, if P and Q and Q and P are truth-conditionally 
equivalent and P and then Q is an implicature, (5) should give rise, from a seman- 
tic point of view, to a non-informative reading and its logical structure (6) should 
be trivially true (from Wilson and Sperber 2012: 171). 


(5) It’s always the same thing at parties: either I get drunk and no-one will talk to
me or no-one will talk to me and I get drunk.


(6) P and Q, or Q and P
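
The truth-conditional point can be checked mechanically. The sketch below, in Python, enumerates the four valuations of P and Q and confirms that, if and is purely truth-functional, the second disjunct of (6) adds nothing to the first. The name structure_6 is introduced here for illustration only and does not come from the cited literature.

```python
from itertools import product


def structure_6(p: bool, q: bool) -> bool:
    """Logical structure (6): (P and Q) or (Q and P), with truth-functional 'and'."""
    return (p and q) or (q and p)


# One row per valuation: P, Q, value of (6), value of plain P and Q.
rows = [
    (p, q, structure_6(p, q), p and q)
    for p, q in product([True, False], repeat=2)
]
for p, q, value_6, plain_conjunction in rows:
    print(int(p), int(q), int(value_6), int(plain_conjunction))

# Because conjunction is commutative, (6) has exactly the truth conditions of P and Q.
second_disjunct_is_redundant = all(
    value_6 == plain_conjunction for _, _, value_6, plain_conjunction in rows
)
print(second_disjunct_is_redundant)
```

On a purely truth-functional reading, then, the either ... or construction in (5) would offer no real alternative, which is not how speakers understand it.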


Manifestly, in (5), P and Q does not equal truth-conditionally Q and P, which has 
led Wilson and Sperber (2012) to propose that the and-enrichment is a process 
occurring not at the level of implicature but at the level of explicature, that is, the 
enriched and developed full propositional form. So, the only pragmatic reading 
of (5) is the result of a pragmatic process, occurring at the level of explicit content,
as in (7). 


(7) P and then Q, or Q and then P


In sum, the and case, which seems to be a paradigmatic case, combining both the 
logical meaning of a conjunction and the non-truth-conditional meaning of a 
generalized conversational implicature, does not bring any solution to the prag- 
matic enrichment issue. 

The two other cases, or and if, illustrate similar problems, but with different 
consequences. 
2.2 Or-meaning: why implicatures are not enough


The second problematic case is illustrated by the or-implicature. In a Gricean 
framework, or triggers an implicature because the speaker has a choice for a 
stronger alternative, that is, and. So, if the speaker cannot assert P and Q, he pre- 
fers to assert the strongest statement possible which does not flout the first 
maxim of Quality (“do not say what you believe to be false”). In other words, the 
use of or communicates the speaker’s ignorance: he does not know which dis- 
junct is true. This is clearly the case in the Gricean example in (8). 


(8) Daughter: Where is mom? 
Father: In the bathroom or the kitchen. 


In this case, as the mother cannot be in both places at the same time, the only
possible interpretation of or is its exclusive meaning, which is a stronger truth- 
conditional meaning than the logical inclusive disjunction connective. Moreover, 
as regards its truth-conditional meaning, and is the strongest connective, since 
the conjunction is only true if both conjuncts are true. 


Tab. 1: Truth conditions for conjunction (∧), inclusive (∨) and exclusive disjunction (⊻)


P   Q   P ∧ Q   P ∨ Q   P ⊻ Q
1   1     1       1       0
1   0     0       1       1
0   1     0       1       1
0   0     0       0       0
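
Table 1 can be regenerated by enumerating the valuations of P and Q. The following Python sketch does so; the function names are hypothetical labels chosen for this illustration. It also counts, for each connective, the number of valuations in which it is true, as a rough indication of strength (the fewer verifying valuations, the stronger the connective).

```python
from itertools import product


def conjunction(p: bool, q: bool) -> bool:
    return p and q


def inclusive_or(p: bool, q: bool) -> bool:
    return p or q


def exclusive_or(p: bool, q: bool) -> bool:
    return p != q


def table_1() -> list:
    """Rows of Table 1 in the order P, Q, conjunction, inclusive or, exclusive or."""
    return [
        (int(p), int(q), int(conjunction(p, q)),
         int(inclusive_or(p, q)), int(exclusive_or(p, q)))
        for p, q in product([True, False], repeat=2)
    ]


for row in table_1():
    print(*row)

# Number of valuations verifying conjunction, inclusive or and exclusive or.
true_counts = [sum(column) for column in list(zip(*table_1()))[2:]]
print(true_counts)
```

Conjunction is verified by a single valuation, exclusive disjunction by two and inclusive disjunction by three, which is consistent with the ranking of strength discussed in the text.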


However, the Gricean explanation does not make explicit the reason why the 
stronger connective for or is and and does not explain why the or-reading is not 
inclusive but exclusive. In other words, the explanation via the first Quantity 
maxim does not explain why this or-interpretation is more restricted and ex- 
cludes the situation in which both disjuncts are true. 

It is only in the neo-Gricean framework that this explanation is provided.
More specifically, in Horn’s (1972) proposal, the relevant notion is that of semantic
scale: in a semantic scale <α, β>, where β is a weaker expression than α, that
is, α entails β, the assertion of β implicates the negation of α. Gazdar (1979: 59)
gives the following definition of a scalar (potential) implicature:
φ scalar-quantity-implicates that the speaker knows that it is not the case that ψ if and only
if there is some sentence φ′, just like ψ except that it contains a “weaker” scalar expression,
and which is entailed by ψ and is either identical to φ or forms a part of it.


Now, how is it possible for or to implicate its exclusive meaning? Gazdar's (1979)
reasoning uses a logical demonstration, based on or-implicature. Suppose that
the semantic scale is <and, or>, as the truth conditions in Table 1 show. Then, the
two following entailment (→) and implicature (+>) relations hold, given in (9):


(9) Entailment: P and Q → P or Q
Implicature: P or Q +> not (P and Q)
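
The entailment half of (9) is what places and above or on the scale, and it can be verified by brute force. In the minimal Python sketch below, entails, and_ and or_ are names introduced for this illustration; entailment is modelled, as an assumption, as truth preservation across all valuations of P and Q.

```python
from itertools import product
from typing import Callable

Connective = Callable[[bool, bool], bool]


def entails(stronger: Connective, weaker: Connective) -> bool:
    """True iff 'weaker' holds in every valuation in which 'stronger' holds."""
    return all(
        weaker(p, q)
        for p, q in product([True, False], repeat=2)
        if stronger(p, q)
    )


def and_(p: bool, q: bool) -> bool:
    return p and q


def or_(p: bool, q: bool) -> bool:
    return p or q


and_entails_or = entails(and_, or_)
or_entails_and = entails(or_, and_)
print(and_entails_or, or_entails_and)
```

The entailment holds in one direction only, so the two connectives form an ordered scale. The implicature half of (9) is not a matter of truth tables: it is a defeasible inference drawn from the speaker's choice of the weaker item.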


Gazdar’s (1979: 59) simplified demonstration is given in (10).


(10) i. P ∨ Q
ii. ¬(P ∧ Q) implicature of (i)
iii. P ⊻ Q entailed by (i) and (ii)


Step (iii) must be explained: how can the exclusive meaning of a disjunction be 
entailed both by the inclusive disjunction and its implicature? Here is the expla- 
nation: the or-exclusive meaning is the conjunction of or-semantics and or-scalar 
implicature. This is made explicit in Table 2. 


Tab. 2: Truth conditions for exclusive or


P   Q   P ∨ Q   P ∧ Q   ¬(P ∧ Q)   (P ∨ Q) ∧ ¬(P ∧ Q)   P ⊻ Q
1   1     1       1        0               0               0
1   0     1       0        1               1               1
0   1     1       0        1               1               1
0   0     0       0        1               0               0
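
The columns of Table 2 can be recomputed in a few lines. The Python sketch below uses the hypothetical names said, implicated and communicated for the inclusive meaning, the scalar implicature and their conjunction; it checks that the conjunction coincides with exclusive disjunction in every valuation, and that the implicature taken alone is true when both disjuncts are false.

```python
from itertools import product


def said(p: bool, q: bool) -> bool:
    """Inclusive disjunction: the semantics of 'or'."""
    return p or q


def implicated(p: bool, q: bool) -> bool:
    """Scalar implicature of 'or': not (P and Q)."""
    return not (p and q)


def communicated(p: bool, q: bool) -> bool:
    """What is said conjoined with what is implicated."""
    return said(p, q) and implicated(p, q)


def exclusive_or(p: bool, q: bool) -> bool:
    return p != q


valuations = list(product([True, False], repeat=2))

matches_exclusive_or = all(
    communicated(p, q) == exclusive_or(p, q) for p, q in valuations
)
implicature_true_when_both_false = implicated(False, False)
print(matches_exclusive_or, implicature_true_when_both_false)
```

The last row of the table is the crucial one: the implicature alone does not exclude the case where neither disjunct holds, whereas its conjunction with the inclusive meaning does.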


What is surprising is that the implicature is not enough, since the meaning of
¬(P ∧ Q) makes the implicature true when both disjuncts are false. Let us comment
on this problem.

Suppose that we must choose between two menus in a French restaurant. 
Menu 1 costs 30€ and includes fromage et dessert ‘cheese and dessert’. Menu 2 
costs only 25€ and mentions fromage ou dessert ‘cheese or dessert’. So, the choice 
is simple. If you want both, you choose menu 1 and, if you want only one plate
after the main course, you choose menu 2. Now you know two things by choosing 
menu 2: first, you cannot have both - this is the not (P and Q) implicature mean- 
ing — and, second, you can choose one item between cheese and dessert. Suppose 
now that you want a dessert after having chosen cheese in menu 2. The waiter 
will politely remind you that fromage ou dessert does not mean both. If you agree and
nevertheless want some dessert, then you should pay extra money for it. This 
seems perfectly reasonable, from a logical and pragmatic point of view. Now sup- 
pose that you want cheese and the waiter is embarrassed because there is no more 
cheese, only dessert. You can accept this but one constraint is not satisfied: even 
if the client cannot have both, he can choose one of the two and the restaurant 
must provide both. Finally, suppose that, after the waiter's negative answer
about cheese, he again answers negatively when you ask for a dessert instead. In that
case, you are right in having the impression that a maxim has been flouted.
Nevertheless, exactly as in the second case, the strict truth-conditional interpretation
makes the or-scalar implicature true in this situation!

Here is the main issue: what is implicated cannot encompass speaker mean- 
ing, since what the restaurateur wants to say (means) by writing on the menu 
fromage ou dessert is that only one item can be chosen, even though he is committed
to having both. In any case, the pragmatic meaning of fromage ou dessert
cannot be compatible with the situation ‘neither cheese nor dessert’. But this is what
the truth conditions of not (cheese and dessert) predict.

The solution, as shown in Table 2, comes from conjoining the or-inclusive 
meaning with its implicature. In this case, the situation predicted when both 
propositions are false is ruled out. So, the correct and exclusive meaning, that is, 
the or-exclusive meaning, is obtained via the or-inclusive meaning conjoined 
with its implicature. 

Now, if we recall Grice's definition of conveyed meaning, we encounter the 
classic definition proposed by neo-Griceans (Horn 2004) of what is communicated:
what is communicated is the addition of what is said and what is implicated.
What is said here is the or-inclusive meaning, what is implicated is the
negation of the strong alternative expression in the scale, that is, and.

In sum, the or-case shows that the meaning of an implicature does not ex- 
haust speaker meaning, which raises the question why we must compute impli- 
catures. The answer is that by doing so, we access speaker's meaning indirectly 
(see Moeschler 2017b, 2017c for a complete proposal about scalar implicature of 
quantifiers and speaker meaning). 
2.3 If and the issue of counterfactuals


The third issue is linked to conditional if. Since Geis and Zwicky's (1971) influen- 
tial paper On invited inferences, the pragmatic analysis of conditionals has con- 
verged on the biconditional interpretation of ordinary conditionals (see also de 
Cornulier 1985 for an extensive analysis of French si; van der Auwera 1997a, 1997b 
for a general discussion of conditional “perfection”). In other words, ordinary 
conditionals lead to the biconditional analysis, whose truth conditions are given 
in Table 3 and logical property in (11). 


Tab. 3: Truth conditions for conditional (→) and biconditional (↔)


P   Q   P → Q   P ↔ Q
1   1     1       1
1   0     0       0
0   1     1       0
0   0     1       1


(11) if and only if P, then Q is equivalent to if P then Q and if Q then P
P ↔ Q =df (P → Q) ∧ (Q → P)
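
The equivalence in (11) and the rows of Table 3 can be checked by enumeration. In the Python sketch below, conditional stands for material implication and biconditional for equivalence; both names are labels chosen for this illustration.

```python
from itertools import product


def conditional(p: bool, q: bool) -> bool:
    """Material implication: false only when P is true and Q is false."""
    return (not p) or q


def biconditional(p: bool, q: bool) -> bool:
    """Equivalence: true when P and Q have the same truth value."""
    return p == q


valuations = list(product([True, False], repeat=2))

# Rows of Table 3 in the order P, Q, conditional, biconditional.
table_3 = [
    (int(p), int(q), int(conditional(p, q)), int(biconditional(p, q)))
    for p, q in valuations
]
for row in table_3:
    print(*row)

# (11): the biconditional is the conjunction of a conditional and its converse.
definition_11_holds = all(
    biconditional(p, q) == (conditional(p, q) and conditional(q, p))
    for p, q in valuations
)
print(definition_11_holds)
```

The only row on which the two connectives differ is the third one, where the antecedent is false and the consequent is true.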


The pragmatic explanation goes as follows: the use of a conditional in natural 
language implicates its biconditional meaning. In other words, what is excluded 
in the biconditional meaning is the situation where the antecedent is false and 
the consequent is true. This means that a conditional can be true in only two 
cases: first, when both antecedent and consequent are true and, second, when 
both are false. These two situations are, from a pragmatic point of view, quite 
plausible: when both the antecedent and the consequent are true, it makes sense 
that the conditional relation is true. Conversely, if there is a conditional link, this 
link is still the case when both propositions are false. These two cases are respec- 
tively represented by the suppositional use of conditionals, often called “ordi- 
nary conditional” (Moeschler and Reboul 2001) and the “counterfactual” use 
(Lewis 1973). 

So, from a strict truth-conditional perspective, both the ordinary and the 
counterfactual uses make sense of the truth conditions of the biconditional. Let 
us examine first the ordinary use in (12). 
(12) If Peter comes to the party, Mary will be happy.


If both the antecedent and the consequent are true in (12), then, if the antecedent 
is true, the consequent must be true. There are two possible explanations for this 
fact. The first one is the result of an invited inference: by asserting if P, Q, the 
speaker means or conversationally implicates if not-P, not-Q. Thus, the bicondi- 
tional meaning is, from a strict logical point of view, what the speaker conveys or 
communicates: what is said (if P, Q) plus what is implicated (if not-P, not-Q) 
equals what is communicated (if and only if P, Q). 
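
This arithmetic of meaning can be made explicit. The Python sketch below, whose function names are hypothetical, treats what is said as the material conditional and what is implicated as the inverse conditional (if not-P, not-Q), and checks that their conjunction has the truth conditions of the biconditional in all four valuations.

```python
from itertools import product


def conditional(p: bool, q: bool) -> bool:
    """Material implication."""
    return (not p) or q


def what_is_said(p: bool, q: bool) -> bool:
    """if P, Q"""
    return conditional(p, q)


def what_is_implicated(p: bool, q: bool) -> bool:
    """Invited inference: if not-P, not-Q"""
    return conditional(not p, not q)


def what_is_communicated(p: bool, q: bool) -> bool:
    """What is said plus what is implicated."""
    return what_is_said(p, q) and what_is_implicated(p, q)


valuations = list(product([True, False], repeat=2))

perfection_yields_biconditional = all(
    what_is_communicated(p, q) == (p == q) for p, q in valuations
)
print(perfection_yields_biconditional)
```

The inverse conditional is truth-conditionally equivalent to the converse (if Q, P), which is why adding it to the conditional yields the biconditional of (11).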

Of course, the implicature type of explanation should accommodate the can- 
cellation of the implicature. This seems to be correct, since (13) is not contradic- 
tory. In other words, the biconditional implicature can be cancelled without con- 
tradiction. 


(13) If Peter comes to the party, Mary will be happy but, if he does not come, Mary 
will be happy anyway. 


The second explanation is more pragmatic but it only covers a small part of con- 
ditional uses, that is, conditionals implying a causal relation. In effect, some or- 
dinary conditionals do not imply any causal relation between the antecedent and 
the consequent, as in (14). 


(14) If this is a triangle, then the sum of its angles equals 180°. 


In (14), the relation between the two propositions is that of a definition: the sum 
of the angles of a triangle is equal to 180° (see Blochowiak 2017 for more on defi- 
nitions in relation to connectives). 

In contrast, in (12), the relation is causal: Peter’s presence at the party will 
cause Mary’s happiness. This causal relation is between events and, more specif- 
ically, future events. The temporal operator FUTURE, in this case, scopes over 
each proposition and not over the conditional relation. So, (15) is the correct se- 
mantic interpretation but (16) is not. (16) is best translated as in (17) whereas, in 
French, the consequent is most frequently in the simple future, as in (18). 


(15) if FUTURE (Peter comes), then FUTURE (Mary is happy) 


(16) FUTURE (if Peter comes, then Mary is happy) 
(17) In the future, if Peter comes, Mary will be happy.


(18) Si Pierre vient à la réception, Marie sera heureuse. 


‘If Peter comes to the party, Mary will be happy.’ 


The causal interpretation is not restricted to ordinary conditionals. In the
counterfactual one, as in (19), the causal relation still holds but it concerns past
counterfactual events.


(19) Si Pierre était venu, Marie aurait été heureuse. 


‘If Peter had come, Mary would have been happy.’ 


What about a causal past relation with no counterfactual interpretation? Appar- 
ently, the conditional connective cannot express such a relation. This does not 
mean that such relations cannot be expressed. A typical causal connective is be- 
cause (French parce que). So, the content causal interpretation of parce que may 
be a past content causal relation (Sweetser 1990), as in (20), as well as a present 
one as in (21), but not a future one as in (22), whose only possible reading is that 
of a speech act (Sweetser 1990), as in (23). 


(20) Marie était contente parce que Pierre est venu. 


‘Mary was happy because Peter came.’ 


(21) Marie est contente parce que Pierre est là. 


‘Mary is happy because Peter is there.’ 


(22) t Marie sera contente parce que Pierre sera là. 
‘Mary will be happy because Peter will be there.’ 


(23) (J’affirme que) Marie sera contente, parce que Pierre sera là.


‘(I affirm that) Mary will be happy because Peter will be there.’


So, ordinary conditionals express future causal relations, while causal
connectives are devoted to past and present causal ones. What is more striking is
the systematic use of irrealis tenses (like the French conditional and the English
subjunctive) for expressing counterfactual conditionals. The French imparfait
describes a counterfactual state or event in the present whereas the French
plus-que-parfait describes a past counterfactual state or event (Moeschler and
Reboul 2001). The same holds, respectively, for the consequent clause, with the
present conditional and the past conditional, as (24) to (27) show. 


(24) Si Marie était heureuse, elle nous le dirait. 


‘If Mary were happy, she would tell us.’ 


(25) Si Marie avait été heureuse, elle nous l’aurait dit. 


‘If Mary had been happy, she would have told us.’ 


(26) Si Pierre venait, Marie partirait avec lui. 


‘If Peter came, Mary would leave with him.’ 


(27) Si Pierre était venu, Marie serait partie avec lui. 


‘If Peter had come, Mary would have left with him.’ 


2.4 Conclusion 


In sum, conditionals in natural language seem to be much more complex than a
simple logical conditional relation (material implication) or a biconditional
interpretation (equivalence relation). Moreover, the classic Gricean story seems
to be rather poor as an explanation of the pragmatic enrichment of conditionals. 

The temptation is thus to abandon the logical description of logical connec- 
tives in natural language and to look for more basic cognitively motivated con- 
cepts for describing the uses of logical connectives. For instance, concepts like 
temporal succession and causality for and, doubt for or and supposition for if 
could be invoked. Although this path is often followed in cognitive and func- 
tional linguistics, I would like to propose a formalist explanation and come back 
to a very strong argument given by Gazdar (1979) for explaining truth-functional 
connectives in natural language. I will argue that his explanation can give some 
answers to the question of relations between logical connectives and semantic 
universals (von Fintel and Matthewson 2008). 


3 The formalist approach 


To my knowledge, Gazdar's (1979) approach has not been developed in pragmat- 
ics (for an exception, see Moeschler and Reboul 1994: Ch. 6). However, he pro- 
poses a very convincing argument restricting truth-functional connectives (TFCs) 
to and and or. 


3.1 TFC 


There are sixteen possible TFCs: the two arguments (propositions) yield four
combinations of truth values (1-1, 1-0, 0-1, 0-0), and a connective assigns one of
two truth values to each combination (2⁴ = 16). They are provided in Table 4
(Gazdar 1979: 69; see Lohiniva 2014 for a full discussion); the alphabetic labels
are Gazdar's (1979). 


Tab. 4: Sixteen possible TFCs 


P  Q    A  B  C  D  E  F  G  H  I  J  K  L  M  O  V  X 
1  1    1  1  1  0  1  0  0  1  1  0  1  0  0  0  1  0 
1  0    1  1  0  1  0  0  1  0  1  1  0  1  0  0  1  0 
0  1    1  0  1  1  0  1  0  1  0  1  0  0  1  0  1  0 
0  0    0  1  1  1  1  1  1  0  0  0  0  0  0  0  1  1 


A corresponds to ∨ (inclusive disjunction), B to ← (material implication from Q
to P), C to → (material implication), D to ↑ (negated conjunction), E to ↔
(biconditional), F to ¬P, G to ¬Q, H to Q, I to P, J to ⊻ (exclusive disjunction),
K to ∧ (conjunction), L to P ∧ ¬Q, M to ¬P ∧ Q, O to ⊥ (contradiction), V to ⊤
(tautology) and X to ↓ (negated disjunction). 
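
The count of sixteen can be checked mechanically. The following Python sketch is not part of Gazdar's (1979) presentation; it simply enumerates every way of assigning a truth value to the four argument pairs. The names ROWS and all_binary_connectives are illustrative assumptions.

```python
from itertools import product

# The four argument pairs (P, Q), in the row order of Table 4.
ROWS = [(1, 1), (1, 0), (0, 1), (0, 0)]


def all_binary_connectives():
    """Return every possible truth table over ROWS as a tuple of four values."""
    return list(product((0, 1), repeat=len(ROWS)))


tables = all_binary_connectives()

# Three columns of Table 4, written as output values for ROWS.
A = (1, 1, 1, 0)  # inclusive disjunction
C = (1, 0, 1, 1)  # material implication
K = (1, 0, 0, 0)  # conjunction

print(len(tables))  # number of possible binary connectives
print(all(column in tables for column in (A, C, K)))
```

Each tuple corresponds to one column of Table 4, read from top to bottom.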

To make this system easier, Gazdar (1979) proposes to reduce the number of
connectives to eight by grouping the arguments into three sets of truth values:
{1} = 1-1, {0,1} = 1-0 and 0-1, {0} = 0-0. This grouping retains only the
connectives that treat 1-0 and 0-1 alike. Table 5 shows the new set of TFCs, that
is, the relevant set for deciding which connective is a TFC. 


Tab. 5: Candidates for TFCs in natural language 


Arguments   A*  D*  E*  J*  K*  O*  V*  X* 
{1}         1   0   1   0   1   0   1   0 
{0,1}       1   1   0   1   0   0   1   0 
{0}         0   1   1   0   0   0   1   1 


Now, we must decide which connective is a TFC. O* (contradiction) is not relevant 
for natural language: it would mean that sentences in (28) receive the same truth 
value, that is, the value “false”, whether the sentences are true or false. It can 
thus be abandoned. 


(28) a. Geneva is an international city (1) O* (0) Bern is the capital of Switzerland 
(1). 
b. Geneva is not an international city (0) O* (0) Bern is the capital of Switzer- 
land (1). 
c. Geneva is not an international city (0) O* (0) Bern is not the capital of Swit- 
zerland (0). 


The same argument can be used for V* (tautology), because it would result in 
considering all sentences in (28) to be true. 

The criterion used by Gazdar (1979) is the principle of confessionality. This 
principle states that a TFC cannot yield a true proposition from false arguments. 
In other words, a TFC (C) must confess the falsity of its arguments by assigning
the value false to false arguments: 


(29) A connective c ∈ C is confessional iff c({0}) = 0 
(Gazdar 1979: 76) 


This criterion, which refers to the first maxim of Quality (“do not say what you
believe to be false”), allows us to reject connectives D*, E* and X*, all yielding
the truth value true (1) with false arguments (0). O* and V* have been put aside
for reasons of relevance, as discussed before. D* is not a classical connective in
logic (↑ or |), but E* corresponds to the equivalence or biconditional connective
(↔, if and only if). So, the last candidates are A*, J* and K*. A* corresponds to
inclusive or (inclusive disjunction), J* to exclusive or (exclusive disjunction)
and K* to and (conjunction). Since, as demonstrated above, exclusive or can be
derived from inclusive or, the only possible TFCs in natural language are inclusive
disjunction and conjunction, that is, or and and. These two connectives are
predicted to be semantic universals. 
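
The selection procedure just described can be replayed as a small filter. The Python sketch below is only an illustration under stated assumptions: the dictionary TABLE_5 restates the columns of Table 5, and the helper names is_confessional and is_constant are hypothetical labels for definition (29) and for the relevance argument against O* and V*.

```python
# Columns of Table 5: output values for the argument sets {1}, {0,1} and {0}.
TABLE_5 = {
    "A*": (1, 1, 0),
    "D*": (0, 1, 1),
    "E*": (1, 0, 1),
    "J*": (0, 1, 0),
    "K*": (1, 0, 0),
    "O*": (0, 0, 0),
    "V*": (1, 1, 1),
    "X*": (0, 0, 1),
}


def is_confessional(column):
    """Definition (29): the value for the argument set {0} must be 0."""
    return column[2] == 0


def is_constant(column):
    """O* and V* return the same value whatever their arguments are."""
    return len(set(column)) == 1


candidates = [
    name
    for name, column in TABLE_5.items()
    if is_confessional(column) and not is_constant(column)
]
print(candidates)
```

The surviving labels are the three candidates named in the text; the further step from three candidates to two (deriving exclusive or from inclusive or) is pragmatic and is not modelled here.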

So far so good, but what about negation and the conditional? I would like to
argue that negation satisfies the same test as disjunction and conjunction and
that the conditional, being a non-confessional connective, must be explained at
another level, that is, reasoning. Finally, if the principle of confessionality is
the right pragmatic criterion, then it follows that all connectives in natural
language must be confessional. I will show in Section 5 that this is the case, at
least for causal and concessive connectives. 


3.2 Negation 


Negation is a unary operator, because it has only one argument. There are four
possible unary operators, because each of the two truth values of the argument
can be mapped to either truth value (2² = 4). They are provided in Table 6
(Gazdar 1979: 68). 


Tab. 6: Possible unary operators in natural language 


Argument   T   N   P   Q 
1          1   0   1   0 
0          0   1   1   0 


So, what has to be explained is why natural languages only have N as an operator,
that is, negation. 

First, T is eliminated by the submaxim of Manner (“be brief”): there is an
equivalence between any proposition φ and Tφ: Tφ ↔ φ. 

Second, P and Q are eliminated because of the maxim of Relation (“be relevant”):
whatever the truth value of φ and ψ, Pφ is true and Pψ is true, which yields the
equivalence between Pφ and Pψ: Pφ ↔ Pψ. The same reasoning holds for the operator
Q, but with a false truth value: Qφ is always false and Qψ is always false, so
Qφ ↔ Qψ. Thus, only N is available for natural languages. Moreover, N can be used
to define T: Tφ ↔ NNφ. 

In sum, negation is the only possible unary operator for natural language.
This is good news, because it implies that a negative statement expresses the
falsehood of the proposition over which the negation scopes. 
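
The inventory of unary operators and the definition of T by double negation can be verified in a few lines. This Python sketch is an illustration only; the tuple encoding and the helper apply_op are assumptions made for the example, not Gazdar's notation.

```python
from itertools import product

ARGS = (1, 0)  # argument values, in the row order of Table 6

# Each unary operator is a pair of outputs for the arguments 1 and 0.
operators = list(product((0, 1), repeat=2))

T = (1, 0)  # returns its argument unchanged
N = (0, 1)  # negation
P = (1, 1)  # always true
Q = (0, 0)  # always false


def apply_op(op, value):
    """Look up the output of operator `op` for a given truth value."""
    return op[ARGS.index(value)]


# Applying N twice yields the same outputs as T.
double_negation = tuple(apply_op(N, apply_op(N, value)) for value in ARGS)

print(len(operators))
print(double_negation == T)
```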


3.3 Conditionals and reasoning 


The last issue regards conditionals, because material implication and the equiv- 
alence connective are not confessional TFCs - they yield a true proposition from 
false arguments. 

One argument against the principle of confessionality, and for maintaining
conditional and equivalence connectives as TFCs, is reasoning. In what follows, I
will give a similar argument to Gazdar’s (1979) for disjunction: as the more
specific connective can be obtained via implicature, only the broader one should
be considered as a TFC, that is, the unilateral conditional or material
implication (→ or ⊃). 

Contrary to other connectives, like conjunction, which gives rise to analytic
deductive elimination rules (P ∧ Q ⊢ P, P ∧ Q ⊢ Q), deductive elimination rules
for the conditional are synthetic rules (Sperber and Wilson 1995): they yield a
true conclusion from two premises, and not from one premise as with analytic rules
(∧-elimination rule). 

There are two deductive elimination rules for the conditional connective: mo- 
dus ponens and modus tollens, given in (30) and (31) respectively. 


(30) Modus ponens 


inputs (i) P → Q 
       (ii) P 
output Q 


(31) Modus tollens 


inputs (i) P → Q 
       (ii) ¬Q 
output ¬P 


These deductive rules contrast with the deductive schema of an invited inference, 
which leads to a logically false conclusion (see Mercier and Sperber 2017: 26 for 
an explicit discussion of conditional inferences), as in (32). 


(32) Invited inference 
inputs (i) P → Q 
       (ii) ¬P 
output ¬Q 
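
The contrast between the two deductive rules and the invited inference can be made concrete by brute force over the four valuations of P and Q. The sketch below is an illustration under the assumption that the conditional is read as material implication; the helper names implies and is_valid are hypothetical.

```python
from itertools import product


def implies(p, q):
    """Material implication: false only when p is true and q is false."""
    return (not p) or q


def is_valid(premises, conclusion):
    """Valid iff no valuation makes every premise true and the conclusion false."""
    for p, q in product((True, False), repeat=2):
        if all(premise(p, q) for premise in premises) and not conclusion(p, q):
            return False
    return True


def conditional(p, q):
    return implies(p, q)


modus_ponens = is_valid([conditional, lambda p, q: p], lambda p, q: q)
modus_tollens = is_valid([conditional, lambda p, q: not q], lambda p, q: not p)
invited_inference = is_valid([conditional, lambda p, q: not p], lambda p, q: not q)

print(modus_ponens, modus_tollens, invited_inference)
```

The invited inference fails on the valuation where P is false and Q is true, which is exactly the case in which the conditional is true although its antecedent is false.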

Imagine the following situation: Paul and Susan are concerned about Mary,
because Peter might be present or absent. Susan says if Peter comes, Mary will be
happy. Two scenarios could happen. First, Peter comes and then Paul and Susan
are relieved: Mary will be happy, unless Susan has said something false. So,
Susan and Paul’s reasoning assumes a true premise, yielding a true conclusion. In
other words, what makes sense of a conditional is that it is presumed to be
true. Of course, the bet is on the truth of the antecedent. 

What happens if the antecedent is false? In that case, the conclusion is prag- 
matically inferred as false. In our scenario, if Susan hears that Peter cannot come 
(he missed his plane), then she concludes that Mary will not be happy. However, 
there is absolutely no logical grounding for this conclusion. The conditional 
could be true in case the antecedent is false and the consequent true, as Table 3 
shows. 

Now, what about the counterfactual interpretation? Susan says if Peter had 
been there, Mary would have been happy. Here, neither the antecedent nor the
consequent is true: the counterfactual interpretation is obtained because both
propositions are supposed to be false. In that case, two ways of obtaining the 
counterfactual interpretation are possible. The first path is to use the invited in- 
ference schema: in that case, the falsehood of the antecedent implicates the false- 
hood of the consequent. The second path is using the modus tollens schema. Sup- 
pose Mary appears to be unhappy: in that context, the conditional plus the 
negation of the consequent leads to the negation of the antecedent, that is, Peter 
did not come. In other words, the counterfactual interpretation is obtained either 
by forward (invited inference) or backward (modus tollens) reasoning. 

So, if the inference schemas are incorporated in the semantics of if, then if 
can be a TFC even though it is not a confessional connective. 


3.4 Conclusion 


In conclusion, we have good reasons to define the set of TFCs as including con- 
junction, disjunction, conditional and negation. As they are defined truth-condi- 
tionally, this means that their semantics are minimal and can be captured by their 
logical meanings. All the possible pragmatic readings (temporal for and, exclu- 
sive for or, biconditional for if) are obtained by pragmatic inference. 

To have a complete picture, we should explain some uses of negation classi- 
fied as metalinguistic, because they are a priori non-truth-conditional. 


4 Metalinguistic negation 


4.1 The classic approaches 


Metalinguistic negation is a use of negation where the speaker does not want to 
deny a positive utterance, as in descriptive negation, but refuses to assert a pre- 
vious assertion (Horn 1985, 1989), as in (33) to (41). 


(33) Around here we don't eat tom[eiDouz] and we don't get stressed out. We eat 
tom[a:touz] and we get a little tense now and then. 


(34) Mozart's sonatas weren't for violin and piano, they were for piano and violin. 
(35) I didn't manage to trap two mongeese: I managed to trap two mongooses. 
(36) Anne doesn't have three children, she has four. 

(37) You didn't eat some of the cookies, you ate all of them. 

(38) It isn't possible she'll win, it's downright certain she will. 

(39) John isn't patriotic or quixotic, he's both patriotic and quixotic. 

(40) I'm not happy, I'm ecstatic. 

(41) It's not warm out; it's downright hot. 


In all these examples, there is no denial of a proposition but rather the
speaker's refusal to assert a proposition. In Horn's (1985, 1989) analysis,
negation is unambiguously truth-conditional but the use of negation in (33) to
(41) is not truth-conditional: the speaker refuses to assert the proposition under
the scope of negation. 

Let us take as a paradigmatic example (36) — for examples such as (33) to (35), 
see Moeschler (1997). Let us analyse (36) as follows (Moeschler 2013a): 


(42) Anne doesn’t have three children (NEG), she has four (COR). 
a. COR → POS 
b. not (Anne has exactly three children) 


First, the corrective clause (COR) entails the positive counterpart of the negative 
clause (POS): in effect, if X has four children, then X has three children. Second, 
what does negation scope over? It cannot be POS, because this would imply that 
POS is under the scope of negation and simultaneously entailed by COR, which 
would lead to a contradiction, as shown in (43). 


(43) not (Anne has three children) and (Anne has three children) 


So, in (42), negation does not scope over the positive counterpart (POS) but over 
the implicature of POS (i.e., ‘Anne has no more than three children’). In that case, 
(42) is no longer contradictory, as (44) shows. 


(44) not (Anne has exactly three children) and (Anne has four children) 
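
The difference between (43) and (44) can be illustrated with a toy model of numerals. The Python sketch below is only an assumption-laden illustration: it encodes the semantic reading of a numeral as a lower bound ("at least") and the enriched reading as "exactly", and it checks satisfiability over a small hypothetical domain of 0 to 10 children.

```python
def at_least(n, k):
    """Lower-bound reading of a numeral: 'has k children' means at least k."""
    return n >= k


def exactly(n, k):
    """Enriched reading: at least k and no more than k."""
    return n == k


def satisfiable(condition, domain=range(0, 11)):
    """True if some number of children in the domain satisfies the condition."""
    return any(condition(n) for n in domain)


# (43): negation scopes over POS, while COR ('she has four') entails POS.
reading_43 = satisfiable(lambda n: not at_least(n, 3) and at_least(n, 4))

# (44): negation scopes over the enriched reading 'exactly three'.
reading_44 = satisfiable(lambda n: not exactly(n, 3) and at_least(n, 4))

print(reading_43, reading_44)
```

Under these assumptions, the first reading has no model (it is contradictory), whereas the second is satisfied as soon as Anne has four children.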


Now we know why utterances of the form NEG, COR with metalinguistic negation 
are not contradictory. However, there is a surprising consequence of this analy- 
sis. 


4.2 The representational approach 


In some recent papers (Moeschler 2013a, 2017c), I demonstrated that
metalinguistic negation, when it scopes over an implicature, has representational
effects. In Moeschler (2010, 2013a), I use the same types of arguments to describe
propositional effects resulting from presuppositional negation, when negation
scopes over the assertion and the presupposition: 


(45) Abi does not regret to have failed (NEG), because she passed (COR). 


In (45), she passed (COR) defeats both the assertion Abi regrets that P (POS) and 
its presupposition Abi failed. 

In both cases, metalinguistic negation scoping over an implicature and a pre- 
supposition, the contextual effects are representational: 


(46) Contextual effect of metalinguistic negation scoping over a scalar
implicature: 
strengthening of POS 


(47) Contextual effect of presuppositional negation: 
suppression of POS and POS presupposition 


In other words, both uses of metalinguistic negation entail and implicate repre- 
sentational effects. If this is true, then metalinguistic negation not-P means ‘the 
speaker cannot affirm that P’ and has representational effects. As a conclusion, 
metalinguistic uses are not special cases of non-truth-conditional meaning: on 
the contrary, they support a truth-conditional analysis. 


5 Discourse connectives as TFCs 


One of the main questions that pragmatics must answer is why natural languages 
display a great number of discourse connectives, such as causal, temporal and 
concessive ones. At first glance, none of these connectives seems to exhibit a 
truth-conditional meaning. Even though a truth-conditional meaning could be 
partially relevant at the semantic level, the meanings of discourse connectives 
seem to focus on other properties. 

Let us take the but example. But has truth-conditional properties, like the 
conjunction and. One test for this type of meaning is the equivalence of truth- 
conditional meanings between P but Q and Q but P: 


(48) a. Paul is smart but lazy. 
b. Paul is lazy but smart. 


(48a) has the same truth conditions as (48b) but certainly not the same prag- 
matic meaning. (48a) is negatively oriented and leads to the conclusion that 
Paul’s laziness is a stronger argument than his smartness (Anscombre and Ducrot 
1977, 1983). On the other hand, (48b) is positively oriented: the sequences in (49) 
and (50) show that different conclusions are obtained (cf. Ducrot 1980 for the con- 
cept of argumentative orientation). 


(49) a. Paul is smart, but lazy: he is not the right person for the TA position. 
b. # Paul is smart, but lazy: he is the right person for the TA position. 


(50) a. Paul is lazy, but smart: he is the right person for the TA position. 
b. # Paul is lazy, but smart: he is not the right person for the TA position. 


As a second property, both conjuncts with but must be true: all the combinations 
in (51) show that Paul is smart and Paul is lazy must be true if they appear in the 
sequence X but Y. 


(51) a. # Paul is smart but not lazy. 
b. # Paul is not smart but lazy. 
c. # Paul is not smart but not lazy. 


Which conclusions can we draw from these facts? Are they specific to but or do
they, on the contrary, illustrate a general property of discourse connectives? I
would like to make the following stipulation: almost all discourse connectives
exhibit factivity, only a few of them being non-factive. 

Let us first discuss one possible counterexample. In French, puisque ‘since’
exhibits some interesting properties (Groupe λ-l 1975; Zufferey 2010, 2012). 

First, puisque can have a counterfactual use (Groupe λ-l 1975). In (52), the
speaker knows that his addressee does not know everything and is therefore
unable to give him the first three finishers. 


(52) Puisque tu sais tout, donne-moi le tiercé. 


‘Since you know everything, give me the first three finishers (in a horse race).’ 


Second, and more generally, puisque introduces old or presupposed information, 
belonging to the common ground, as (53) shows. 


(53) Puisque tu es là, allons dans le salon. 


‘Since you are here, let us go to the living room.’ 


Other causal connectives, like parce que ‘because’, are factive: parce que cannot 
be used if the cause and the consequence are false (Blochowiak 2010, 2014; 
Moeschler 2016). Even when the causal relation is under the scope of negation, 
as in (54), negation does not scope over any propositions (presupposed as true) 
but over the causal relation, as in (55). 


(54) Paul n'est pas tombé parce que Marie l'a poussé, mais parce qu'il s'est pris
les pieds dans une racine. 
‘Paul did not fall because Mary pushed him but because he tripped over a
root.’ 


(55) not (Mary pushed Paul CAUSE Paul fell) and (Paul tripped over a root CAUSE
Paul fell) 


Concessive connectives, on the other hand, defeat one specific inference, like 
mais ‘but’, pourtant ‘however’ and bien que ‘although’ (Lakoff 1971; Anscombre 
and Ducrot 1977; Moeschler 1989; Lohiniva 2017). For instance, the same conces- 
sive relation can be expressed by different connectives in French, as (56) to (58) 
show. 


(56) Il est républicain mais honnête. 
‘He is a Republican but honest.’ 


(57) Il est républicain, pourtant il est honnête. 


‘He is a Republican; he is honest, however.’ 


(58) Bien qu’il soit républicain, il est honnête. 


‘Although he is a Republican, he is honest.’ 


In each example, a false inference contrasts with the second conjunct: the ex- 
pected inference is he is not honest. This is clearly an unexpected statement in the 
case of bien que: in bien que P, Q, Q is less likely to be the case than not-P (Loh- 
iniva 2017) — see the use of the French subjunctive with bien que. On the other 
hand, in the case of mais, the inference is forward (X is a Republican +> X is not
honest) whereas, in the case of pourtant, the reading is either forward or
backward (X is honest +> X is not a Republican). In each situation, both conjuncts are 
asserted as true, as the negation test in (59) and (60) shows for pourtant and bien 
que. 


(59) a. # He is not a Republican, pourtant he is honest. 
b. # He is a Republican, pourtant he is not honest. 
c. # He is not a Republican, pourtant he is not honest. 


(60) a. # Bien que he is not a Republican, he is honest. 
b. # Bien que he is a Republican, he is not honest. 
c. # Bien que he is not a Republican, he is not honest. 


So, what is the difference in meaning between these quasi-synonymous
connectives in French? Contrary to logical connectives, where the pragmatic
meaning is the result of a narrowing of their logical meaning (enrichment), the
semantic meaning of concessives cannot be extended from the logical conjunction
meaning. 

In Moeschler (2016), I proposed a general answer to this puzzle, which is cru- 
cial for explaining why connectives are so pervasive in natural language. 

First, some connectives show a general conceptual relation, or a relational 
concept, which can be captured by concepts like CAUSE or CONTRAST. These 
concepts are assumed to be part of the conceptual meaning of connectives. These 
meanings are not truth-conditional because they cannot be expressed by one of 
the sixteen TFCs. Conceptual meaning includes not only a conceptual relation, 
like cause, but also all its possible entailments. For instance, in its causal uses, 
donc ‘therefore’ does not entail the consequence Q in P donc Q, because the 
speaker is responsible for the inference. Hence, donc is not a factive connective. 
A test for demonstrating the non-factivity of donc is the possible adjunction of an 
epistemic modal predicate in the consequence clause, as in (61). 


(61) a. Marie a poussé Jean, donc il est tombé. 
‘Mary pushed John donc he fell.’ 
b. Marie a poussé Jean, donc il a dû tomber. 
‘Mary pushed John donc he must have fallen.’ 


Second, beside its conceptual meaning, a connective has a procedural meaning, 
which is about, as regards causal connectives, the direction of the causal relation. 
What is striking is that French parce que is the only backward connective 
(Moeschler 2011), as in (62). In effect, when donc is backward, its use is not causal, 
but inferential: in P donc Q, the speaker's inference is about a possible cause Q, 
as in (63). 


(62) Jean est tombé parce que Marie l'a poussé. 
‘John fell because Mary pushed him.’ 


(63) Jean est tombé, donc Marie l'a poussé. 


‘John fell, donc Mary pushed him.’ 


Finally, conceptual meaning can intervene at the level of explicature (parce que) 
or implicature (et, donc), this distinction being based on the cancellation test — 
implicatures are cancellable, explicatures are not. Table 7 gives a summary of 
such an analysis (Moeschler 2016: 134). 


Tab. 7: A chart for causal connectives 


Meaning       Conceptual                                             Procedural 
Connectives   Entailment   Explicature    Implicature                Direction of CAUSE 
parce que     P, Q         CAUSE (X, Y)                              Q → P 
donc          P                           POSSIBLE_CAUSE (X, Y)      P → Q 
et            P, Q                        POSSIBLE_CAUSE (X, Y)      P → Q 


In sum, what makes discourse connectives specific in natural languages is not 
their non-truth-functionality but their conceptual and procedural meanings. The 
proposal made in Moeschler (2016) assumes that the slight meaning differences 
between quasi-synonymous connectives do not lie in the difference of conceptual 
or procedural meanings but in the way conceptual and procedural meanings are 
distributed within different layers of meanings. Broadly speaking, the meaning 
bricks of connectives are dispatched at different layers of meaning, such as en- 
tailment, explicature and implicature (Moeschler 2013b), as well as distributed 
over different conceptual/procedural drawers. 


6 Conclusion 


This article had the ambition to come back to a very classical analysis of TFCs in 
natural languages. Doing so, we obtained unexpected results. 

First, the formalist view of TFCs does not only explain why conjunction, dis- 
junction and negation are TFCs in natural language but also why a non-confes- 
sional connective like the conditional can join the restricted set of TFCs. 

Second, a classical counterexample to a truth-conditional analysis of negation
is metalinguistic negation. I assumed in this paper that, even in metalinguistic
uses, negation has cognitive effects that are representational and
truth-conditional. 

Third, I have given an argument explaining why natural languages have a
large set of discourse connectives. I showed that, first, almost all of them
exhibit truth-conditional properties and, second, even if their meaning is
non-truth-conditional, it is conceptual and/or procedural and distributed within
different layers of meaning, such as entailment, explicature and implicature. 

Only a few explicit proposals have been made in this direction. A new research
program, capitalizing on a large set of descriptions of connectives in different
languages, should emerge to answer new research questions, such as why some
connectives of different languages are at the same time so close and so remote in
meaning. 


Vladimir Plungian 
Notes on Eastern Armenian verbal 
paradigms 


“Temporal mobility” and perfective stems 


Abstract: The paper discusses two “hidden” semantic oppositions in the Arme- 
nian verbal system: both have no specific segmental markers but are manifested 
in the division of verbal forms into certain formal classes. In the first case, we deal 
with the division into synthetic and periphrastic forms, which corresponds to 
the expression of the so-called "temporal mobility" (or the ability to express the 
opposition between present and past). In the second case, it is the morphological 
opposition between the basic verbal stem and the stem with an alternation. The 
choice of the alternating stem is related to the perfective semantics of the verbal 
form, so that one can speak of a general aspectual opposition of perfective and 
imperfective sets of forms in Armenian (not isolated in traditional analysis). 


Keywords: Armenian, verbal inflection, tense, aspect 


1 Introduction 


The main focus of the present paper will be certain formal properties of verbal 
paradigms in Armenian, first of all those which may have special cross-linguistic 
relevance. To the best of our knowledge, these properties have not been dis- 
cussed in the specialized literature at any length (if at all). The main bulk of our 
material comes from the standard written language of the Republic of Armenia, 
i.e., Average East Armenian (cf. Vaux 1998). Hereafter, we will refer to it simply 
as “Armenian”, unless otherwise stated. Similar properties of other idioms of 
Modern Armenian (both Eastern and Western dialects) deserve a separate and a 
more detailed discussion, which is far beyond the scope of the present study. 

Inflectional morphology of the Armenian verb has been extensively described
in grammars and works of narrower perspective (e.g., Abrahamyan 1962;
Agayan 1967; Abrahamyan, Paṙnasyan and Ohanyan 1974; Minassian 1980;
Kozintseva 1991, 1995a, 1995b), so we can rely here on a body of facts that may be 
considered generally agreed upon. However, when describing an inflectional sys- 
tem, it is important to establish not only the size of the paradigm and the rules of 
form composition but also what may be called a grammatical interpretation of 
these forms. The latter is related to grammatical semantics rather than to inflec- 
tional morphology as such and is a far less studied area. What will be addressed 
below are mainly problems belonging to this domain. 

Firstly, we will briefly present the basic facts of verbal inflection (Section 2). 
We will then discuss the possibility to isolate, within the Armenian verbal para- 
digm, two “hidden” grammatical oppositions: “temporal mobility” (Section 3) 
and morphological marking of perfectivity (Section 4). 


2 The general configuration of the Armenian 
verbal paradigm 


One of the basic structural oppositions within the Armenian verbal paradigm is 
that of synthetic and analytic (periphrastic) forms. From a diachronic perspec- 
tive, it is also important that the majority of periphrastic forms (according to a 
widely attested cross-linguistic pattern) have a more recent origin and display a 
greater variability in dialects. The older group of synthetic forms (well-docu- 
mented already in Classical Armenian, which is known in its written form since 
the 5th century AD) has conserved, to a considerable degree, phonological shape 
and morphological structure but has undergone considerable semantic changes 
and migrated from the domain of aspectual-temporal forms of the indicative to 
the domain of non-indicative modality. An exception to this tendency is
presented by the form of the so-called aorist, which remained in the system of
indicative forms, conserving entirely its synthetic character.¹ Thus, the
opposition of periphrastic and synthetic forms in Modern Eastern Armenian may be
roughly characterized as that of the forms of indicative and non-indicative
modality respectively. However, this preliminary characteristic needs further
specification (see Section 3). 


1 As Vaux (1998: 2) points out, “excepting the aorist, none of the classical formation remained
in place”. This type of evolution of present and past forms of the indicative is well-attested
cross-linguistically. Bybee, Perkins and Pagliuca (1994: 230-236), for instance, provide sufficient
detail on it (Armenian data are also mentioned there). The authors suggest that non-indicative
meanings of originally indicative forms arise out of their usage in subordinate clauses with
modal semantics. 

Another important exception is a small group of four stative verbs (arže- ‘cost’, 
gite- ‘know’, ka- ‘exist, be available’, une- ‘have’), as well as the closely adjacent 
irregular existential copula e- (also used as an auxiliary in the majority of peri- 
phrastic forms). This group has no periphrastic forms at all. To express the indic- 
ative, these verbs use the old synthetic forms (of which only the present and the 
imperfect are available in their paradigm). Thus, this is a typical “relic group”,
untouched by grammatical innovation, a very widespread phenomenon in the
inflectional morphology of the world’s languages.² Naturally, our further
discussion will not concern these verbs. We are also not going to discuss in detail the 
morphological rules of affixation to the verbal stem (which vary within different 
conjugations and, besides, have a number of important exceptions pertaining to 
frequency verbs). Some notes on the formal structure of the paradigm essential 
for our subject will be made in due course. In particular, the choice of a necessary 
verbal stem is essential: different verbal forms may differ not only in affixes but 
also in the type of the stem. This will be dealt with in more detail in Section 4. 

Periphrastic forms consist of the copula e-, the locus of expression of tense 
and subject person/number, and the main converb marking aspect (with the help 
of various suffixes), as this is more closely connected with the semantics of the 
verbal stem. Although, from a grammatical point of view, converbs within peri- 
phrastic forms basically express aspectual oppositions, one should bear in mind 
that the actual range of meanings of periphrastic forms is somewhat wider and 
includes evidential and modal values as well. 

All in all, there exist four forms of aspectual (in the wide sense determined 
above) converbs and, correspondingly, four classes of periphrastic forms: imperfective 
(in -um), perfect (in -el), resultative (in -ac), and destinative (in -lu). Each converb 
can combine with the present and past forms of the copula. Thus, all four 
aspectual values listed above have a present and past series of personal forms. 
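
The combinatorics just described can be made concrete with a small Python sketch. It is only an illustration under stated assumptions, not part of the description itself: it takes gre- as the stem of grel ‘write’, assumes that the thematic vowel -e- drops before a vowel-initial suffix, and uses only the third person singular copula forms é (present) and ér (past). The names `converb` and `periphrastic_series` are hypothetical.

```python
# Converb suffixes named in the text, keyed by aspectual class.
CONVERB_SUFFIXES = {
    "imperfective": "um",
    "perfect": "el",
    "resultative": "ac",
    "destinative": "lu",
}

# Third person singular copula forms only (assumption: other persons omitted).
COPULA_3SG = {"present": "é", "past": "ér"}

VOWELS = set("aeiou")


def converb(stem: str, suffix: str) -> str:
    """Attach a converb suffix; thematic -e- is assumed to drop before a vowel."""
    if stem.endswith("e") and suffix[0] in VOWELS:
        stem = stem[:-1]
    return stem + suffix


def periphrastic_series(stem: str) -> dict:
    """Return all eight (aspect, tense) combinations for one verb stem."""
    return {
        (aspect, tense): f"{converb(stem, suffix)} {copula}"
        for aspect, suffix in CONVERB_SUFFIXES.items()
        for tense, copula in COPULA_3SG.items()
    }


forms = periphrastic_series("gre")
for (aspect, tense), form in forms.items():
    print(f"{tense:8} {aspect:13} {form}")
```

Four converbs combined with two copula tenses give eight series; for gre- the sketch produces, among others, grum é and grelu é, which match the forms cited in this section.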


2 A parallel case is found, for example, in Modern Basque, where only a few frequent verbs 
(including auxiliaries) have a synthetic conjugation while all the others display a periphrastic 
paradigm. From an areal perspective, it might be useful to draw a comparison with the Modern 
Persian verbal system, where a small group of stative verbs like ‘lie’, ‘stand’ and ‘sit’ 
shows a reduced set of forms, which also differ from those of most other verbs in their 
grammatical meaning (for more detail, see Rubinčik 2001: 233-234). 
Now, we briefly explain the choice of labels for periphrastic forms, since not 
all of them are readily established in the literature. The Armenian linguistic tra- 
dition (reflected, for example, in Dum-Tragut 2009) tends to use different termi- 
nology, somewhat apart from what is typically expected in a cross-linguistically 
oriented study. 

Imperfective forms express two main aspectual meanings, the progressive 
(indicating an ongoing activity) and the habitual. This type of polysemy (or 
“grammatical cluster”) is typical of aspectual systems in different areas. In particular, 
it is characteristic of all Slavic languages, Greek, Latin and many others. The pre- 
sent imperfective (i.e., forms like grum é) is the most frequent form, which ex- 
presses the present tense as such. Traditional grammars usually call the past im- 
perfective (i.e., forms like grum ér) “imperfect”, which, in this case, is quite 
legitimate: the imperfect is commonly regarded as a past tense form combining 
aspectual meanings from both actual and habitual domains. 

The forms of the perfect and resultative express the present and past tense of 
the perfect and (subject) resultative respectively. The meaning of resultative as- 
pect is more specific and boils down to asserting the existence, at the moment of 
speech (or at some reference point in the past), of a “natural” (i.e., lexically 
inferable) result of the situation. This form (diachronically later than the perfect) is 
mainly possible with telic processes and has a relatively weak degree of gram- 
maticalization in Modern Eastern Armenian (it is no accident that, in traditional 
descriptions, there are certain hesitations about its inclusion in the core inven- 
tory of grammatical forms). 

The perfect, apart from its central value of “current relevance” (whatever that 
may mean), is also used in evidential contexts to report events not witnessed 
by the speaker personally, i.e., to express an inferential or a reportative meaning, 
more or less in keeping with what is observed in a variety of Great Evidential 
Belt languages (including Iranian, Turkic, Kartvelian and many others). In Western 
Armenian, as compared to Eastern Armenian, the grammaticalization of resultative 
forms is more advanced. There, the form that etymologically corresponds 
to the Eastern Armenian resultative is used as the (generalized) perfect 
while a cognate of the Eastern Armenian perfect (the form with the suffix -er) is 
now a dedicated evidential marker. For a more detailed account of the Eastern 
and Western Armenian perfect-resultative distinction, see, for example, 
Kozintseva (1988, 2000) and Donabédian (1996, 2001).³ 


3 Forms of the past perfect (like grel ér) have their own range of uses, in many ways differing 
from that of the present perfect, and deserve a separate study. For a preliminary overview, see 
Kozintseva (1998) and Sitchinava (2013). 
Finally, destinative forms denote a situation which, at the reference point, is 
considered by the speaker as bound to occur, mainly due to external circum- 
stances. The present destinative (forms like grelu é) is one of the functional equiv- 
alents of the future tense, yet with a strong tinge of modality (one may speak here 
of external deontic modality, according to van der Auwera and Plungian 1998). 
Some contexts where this form is used are reminiscent of what is now usually 
called “prospective aspect” but, in many respects, the Armenian destinative is 
not a typical prospective. Still, the meaning of the destinative lies close to the 
aspectual semantic domain, approaching varieties of the prospective. 

Thus, despite somewhat different degrees of grammaticalization (which is 
the highest in the imperfective and the perfect and is lower in the destinative and 
especially in the resultative), the four series of periphrastic forms represent, in 
total, an orderly system of forms, a nucleus of the indicative paradigm. Besides, 
the system of the indicative also includes a synthetic aorist denoting, in full ac- 
cordance with its name, perfective situations referring to the past and having (in 
contrast with the perfect) no connection with the moment of speech (for a recent 
in-depth treatment, see Donabédian 2016). In certain contexts, aoristic forms may 
be construed as expressing (in opposition with the perfect) an additional eviden- 
tiality-related component, indicating that the speaker has personally witnessed 
the situation referred to. 

Morphologically, the formation of the aorist strongly differs from that of all 
other forms in the verbal paradigm. Aoristic forms are immediately distinguished 
from all other forms and may be easily identified. This is due to the fact that, in a 
given verbal form, the aorist is simultaneously marked several times. It always 
requires a special suffix, which may be of two types: in most verbs, it is the marker 
-c'i- (however, in the third person singular, before a zero subject marker, it has a 
reduced form -c')⁴ while some (sometimes called “strong”) verbs have a vocalic 
marker -a-. The aorist also has a special set of personal endings that differ, in the 
singular, from those of other synthetic paradigms (i.e., of the present and past 
subjunctive considered below). Plural endings are always the same. The strongest 
distinction is shown by the form of the third person, in which the aorist is 
not only opposed to all other verbal forms but is also characterized, in strong verbs, 
by a non-zero ending -v, unique among verbal forms. On the surface, third person 
singular forms of a “weak” aorist with the suffix -c' and a zero ending (like grec‘ 


4 Special morphonological phenomena before the zero subject marker of the third singular are 
also characteristic of other forms of the Armenian verb, for instance, of the subjunctive (see be- 
low), so this allomorphic distribution is not unique. 
from grel ‘write’) and of a “strong” aorist with the suffix -a- and the ending -v (like 
ankav from anknel ‘fall’) differ quite considerably not only from other verbal forms 
but also from each other. Finally, in many verbs, the aorist is formed with a spe- 
cial stem, which is frequently suppletive. 

All other synthetic forms belong to different non-indicative moods, which 
represent a rich system in Modern Eastern Armenian (for a traditional nomencla- 
ture, see also Dum-Tragut 2010). Non-indicative moods include, in the first place, 
the imperative, inherited from Classical Armenian and represented only by forms 
of the second person with special endings (in the first or third person, commands 
are expressed with the help of other moods). A core element within the system of 
non-indicative moods is the so-called subjunctive, with a wide range of functions. 
It is used both in dependent clauses and independent sentences with optative 
and directive semantics, as well as in the protasis of conditional constructions. 
The subjunctive present and past are morphologically distinguished and these 
are the forms that historically go back to the present and imperfect indicative in 
Classical Armenian (i.e., to indicative forms of the imperfective series, which 
were substituted, in Modern Armenian, by periphrastic forms with converbs in 
-um). 

The present and past subjunctive are formed with the help of special sets of 
personal endings (of the present and of the past respectively), which coincide in 
the plural. The forms of the past also have a suffixal marker -i- in all persons, 
except for the third singular. 

In general, the third person singular of the subjunctive present and past (and 
also of the aorist indicative) displays a number of morphological peculiarities. In 
the present (forms like gri or gna), it is natural to isolate a zero person/number 
subject marker, which, however, causes the transition of the thematic vowel -e in 
the final position of the word-form into -i (the thematic vowel -a is not affected by 
this type of alternation). This seems more coherent than the traditional interpre- 
tation of the element -i in forms like gri as a personal ending. 
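
The zero-marker analysis can be stated as a single rule. The following Python sketch is an illustration only: the stems gre- and gna- are inferred from the cited forms, and the function name is hypothetical.

```python
def present_subjunctive_3sg(stem: str) -> str:
    """Third person singular present subjunctive under the zero-marker analysis.

    The person/number marker itself is zero; a word-final thematic -e
    surfaces as -i, while thematic -a is left unchanged.
    """
    if stem.endswith("e"):
        return stem[:-1] + "i"
    return stem


print(present_subjunctive_3sg("gre"))  # gri
print(present_subjunctive_3sg("gna"))  # gna
```

On this view, -i is not an ending but the surface shape of the thematic vowel, so the same function covers both conjugations without a separate entry for each.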

Somewhat more difficult is the problem of a morphological interpretation of 
the past subjunctive third person singular form. It has a marker -r consisting of 
one phoneme (cf. the whole series of forms for the past subjunctive singular: first 
person gre-i, second person gre-i-r and third person gre-r). It would be most rea- 
sonable to believe that, in this case, one deals with a cumulative expression, by 
a phonologically indivisible marker, of past subjunctive and third person singu- 
lar. Remarkably, in the paradigm of the past subjunctive, there is already a non- 
cumulative zero person marker (first person singular) and a non-cumulative per- 
son marker -r (second person singular). Certainly, as an alternative solution, one 
could speak of a special marker of the third singular, also a zero one (like in other 
forms), which, unlike all other zero forms, causes the appearance of a unique past 
subjunctive suffix -r. However, this interpretation is, for obvious reasons, much 
more cumbersome and artificial, since it requires too many arbitrary assump- 
tions. 
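
The preferred (cumulative) analysis of the past subjunctive singular can be laid out as a small table in Python. This is a sketch for illustration only: the gloss labels (PST.SBJV, 1SG and so on) and the function names are assumptions introduced here, and only the three singular forms cited above are covered.

```python
# Each cell lists the (morph, gloss) pairs that follow the stem.
PAST_SUBJUNCTIVE_SG = {
    1: [("i", "PST.SBJV"), ("", "1SG")],   # non-cumulative: -i- plus zero
    2: [("i", "PST.SBJV"), ("r", "2SG")],  # non-cumulative: -i- plus -r
    3: [("r", "PST.SBJV+3SG")],            # cumulative: one indivisible marker
}


def segmented_form(stem: str, person: int) -> str:
    """Join the stem and its overt markers with hyphens; zero morphs are skipped."""
    overt = [morph for morph, _gloss in PAST_SUBJUNCTIVE_SG[person] if morph]
    return "-".join([stem, *overt])


def is_cumulative(person: int) -> bool:
    """True when a single marker carries both mood/tense and person/number."""
    return len(PAST_SUBJUNCTIVE_SG[person]) == 1


for person in (1, 2, 3):
    print(person, segmented_form("gre", person), is_cumulative(person))
```

The alternative analysis would need an extra zero marker in the third person plus a rule inserting -r only there, which is the additional machinery the text calls cumbersome.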

Furthermore, in the system of non-indicative moods, one can distinguish the 
so-called conditional mood, which is morphologically formed by the addition of 
the prefix k(a)- to the forms of the present and past subjunctive. In many modern 
dialects (including Western Armenian), it is this form (or its diachronic continu- 
ation) that occupies the niche of the imperfect in the indicative. Certainly, if we 
rely on its semantics, there are no special grounds to postulate a formation of the 
conditional “from” the subjunctive, as traditional practical grammars usually do: 
this form is used in the apodosis of conditional constructions and denotes a real 
or hypothetical consequence, as well as a probable future occurrence. 

We are not going to consider here the form of the debitive (like piti gri ‘must 
write’), often referred to in the grammars as the fourth non-indicative mood, since 
its meaning reduces to a combination of the meaning of the predicative invariable 
particle piti ‘is needed’ with the meaning of the subjunctive, with which this par- 
ticle is combined as a head predicate. Accordingly, the debitive construction 
practically does not differ from constructions with verbs of volition or command 
also requiring subjunctive marking on the dependent verb. So, in terms of both 
its meaning and form, it belongs to the domain of the subjunctive. 

Such is, in the most general lines, the structure of the Armenian verbal para- 
digm. The question is: what are the non-trivial consequences for a description of 
the grammatical semantics of Armenian verbal forms that one can draw from an- 
alyzing formal properties of this structure (if any)? Now, we proceed to discuss 
this matter. 


3 Periphrastic and synthetic forms 


As has already been said, one of the main structural oppositions within the para- 
digm formed in the course of the transition from Middle to Modern Armenian is 
the opposition of periphrastic and synthetic forms. Thus, a natural question 
arises, viz., whether there is some semantic difference behind that formal oppo- 
sition, which is so important in the Armenian verbal system. Our answer would 
be positive but defining this semantic opposition in a clear and unambiguous 
way is not a simple task. 

At first sight, it seems that periphrastic forms are connected with the indica- 
tive while synthetic ones are connected with non-indicative moods (somewhat 
similar statements have been made explicitly or, more often, implicitly in many 
traditional descriptions). However, the problem with this opposition lies in the 
fact that the notion of non-indicative mood itself has no positive content. Rather, 
it denotes a class formed on the basis of a negative principle. Generally speaking, 
for a typologically oriented description, an appeal to such purely “structural” 
classes is not very informative. 

Let us look more carefully at the two classes of forms within the Armenian 
verbal paradigm. Their names are shown in Table 1. 


Tab. 1: Periphrastic and synthetic forms of the Armenian verb 

Periphrastic forms                 Synthetic forms 
imperfective (present and past)    aorist 
perfect (present and past)         imperative 
resultative (present and past)     subjunctive (present and past) 
destinative (present and past)     conditional (present and past) 


The system presented in Table 1 is rather interesting. It can be seen that the divi- 
sion in two classes is connected neither with the opposition of the indicative and 
non-indicative moods nor with the opposition of diachronically “new” and “old” 
forms. The former opposition is contradicted, on the one hand, by the existence 
of a synthetic form of the aorist indicative and, on the other hand, by the presence 
of a modal component in perfect forms and particularly in destinative ones — 
while the latter opposition is contradicted, for instance, by the periphrastic char- 
acter of the perfect as attested as early as in Classical Armenian. 

The opposition of synthetic and periphrastic forms seems to correlate most 
closely with the expression of the category of tense. It is primarily 
tense that is expressed by periphrastic markers (through auxiliaries) — and, 
hence, the verbal forms that allow tense oppositions are periphrastic. They may 
be called “temporally mobile”. On the other hand, we find synthetic forms either 
among those without any temporal reference at all (as in non-indicative moods, 
whose “tense” forms, as is well-known, are not related to time) or with a fixed 
temporal reference (as with the imperative and aorist, rigidly connected with the 
future and past respectively). Accordingly, verbal forms with a fixed tense refer- 
ence have no separate morphological tense marker: their temporal reference is 
expressed cumulatively, together with aspect or mood. 
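
The proposed correlation can be checked against Table 1 with a short Python sketch. It is an illustration under stated assumptions: the three labels "mobile", "fixed" and "none" are shorthand introduced here for the distinctions drawn in this paragraph, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbForm:
    name: str
    temporal_reference: str  # "mobile", "fixed" or "none"


FORMS = [
    VerbForm("imperfective", "mobile"),
    VerbForm("perfect", "mobile"),
    VerbForm("resultative", "mobile"),
    VerbForm("destinative", "mobile"),
    VerbForm("aorist", "fixed"),       # rigidly connected with the past
    VerbForm("imperative", "fixed"),   # rigidly connected with the future
    VerbForm("subjunctive", "none"),   # "tense" forms not related to time
    VerbForm("conditional", "none"),
]


def expression_type(form: VerbForm) -> str:
    """Predict the formal class from temporal mobility alone."""
    return "periphrastic" if form.temporal_reference == "mobile" else "synthetic"


periphrastic = [f.name for f in FORMS if expression_type(f) == "periphrastic"]
synthetic = [f.name for f in FORMS if expression_type(f) == "synthetic"]
print(periphrastic)
print(synthetic)
```

With these values, the single feature reproduces both columns of Table 1, which neither the indicative/non-indicative nor the old/new opposition does.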

Thus, in a sense, one may say that the Armenian verb expresses formally a 
highly specific category of temporal mobility. It contrasts the forms allowing the 
opposition of both present and past events and the forms disallowing a change 
of temporal reference (or, in general, having no such reference at all). Interest- 
ingly, it is the capacity to denote present time situations that appears to be a cru- 
cial semantic criterion for entering the class of temporally mobile forms: when 
such a capacity exists, the verbal form also has a past reference. However, if a 
present time interpretation of the verbal form is impossible, the temporal mobility 
is obviously absent. One can roughly identify this feature with so-called actuality, 
i.e., with the capacity to refer to situations occurring directly at the moment 
of speech.⁵ In this case, the distribution of periphrastic and synthetic verbal forms 
may be formulated more easily: the categories that, in principle, can mark actual 
situations are expressed by periphrastic forms while the categories excluding, for 
some reason, an actual interpretation have a synthetic expression. 

Once again, although we see the opposition of temporally mobile (actual) 
and non-actual verbal forms as rather specific and, so to speak, idiosyncratic, one 
typological parallel seems nonetheless useful here. It is the verbal category of re- 
ality status, which also consists of two values conventionally called “realis” and 
“irrealis”. The category of reality status usually divides the verbal system into two 
classes of forms. One of them qualifies a situation as belonging to the real world 
(the world of events that are actually occurring or took place in the past) while 
the other does not (for a more detailed discussion, see Elliott 2000; Plungian 
2005). 

The opposition of real and irreal forms can be based on different semantic 
strategies. Therefore, the size and the structure of real and irreal forms may not 
coincide cross-linguistically. Along with the forms whose interpretation in the 
languages that grammatically mark reality status is always identical, there exist 
forms that are marked as real in some languages and as irreal in others. Such are, 
for instance, forms of the imperative and habitual: they have properties of both 
classes of situations. Imperatives denote a situation which does not belong to the 
real world but is believed to occur after the moment of speech in all likelihood. 
On the other hand, habituals describe not a real situation but a kind of abstract 
property: they express the speaker's judgment about the world and not a specific 
situation observed. This ambiguity can explain the not infrequent existence of 
imperatives with realis marking and habituals with irrealis marking. 


5 The term “actuality” is also often used in another sense, i.e., to denote the capacity of a verbal 
form to refer to a definite temporal interval (which does not necessarily include the moment of 
speech). However, the term “temporal localization” (used, for instance, in Kozintseva 1991) is 
preferable for this sense. Notably, the Armenian aorist has the property of temporal 
localization (but not actuality, as we understand it here). 
Hence, a comparison of the category of temporal mobility with that of reality 
status may be of interest, since the strategies of ascribing these two categories to 
verbal forms look similar. This similarity becomes particularly striking if we con- 
sider the role of actuality in the choice of both categories. Recall that the tempo- 
rally mobile forms are exactly those verbal categories that are able to express the 
semantics of actuality. At the same time, actuality is one of the most crucial factors that 
determine the marking of a given verbal form as real (see also Plungian 2005 for 
more discussion). Thus, generally speaking, the category of temporal mobility 
may be considered a non-conventional variant of the category of reality status. It 
is non-conventional both at the level of content (since it opposes forms allowing 
and disallowing actual reference) and at the level of expression (since it opposes 
periphrastic and synthetic verb forms). 


4 Two verbal stems 


The second salient opposition in the Armenian verbal paradigm is the distinction 
between two types of verb stems. As already noted in Section 2, verbal forms may 
differ not only with regard to specific suffixal or prefixal grammatical markers or 
with regard to sets of person/number endings (providing subject person/number 
marking) but also with regard to the type of the stem itself. 

Traditionally, Armenian grammars distinguish two types of conjugation de- 
pending on the stem-final thematic vowel (-e- or -a-). The information about the 
conjugation type is necessary for building a wide range of grammatical forms 
which have a different shape in each conjugation, such as the singular and plural 
imperative and the resultative and perfective converbs. In particular, the the- 
matic vowels behave differently before suffixes with a vocalic initial: the vowel 
/e/ is truncated while the vowel /a/ requires, as a rule, a consonant augment -c’-. 
Some forms use different markers in different conjugations, such as the forms of 
the imperative singular ktr-ir ‘cut!’ from the stem ktr-e- and xaga ‘play!’ (with a 
zero marker) from the stem xag-a-. 
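
The behavior of the two thematic vowels before a vowel-initial suffix can be sketched in Python as follows. This is a minimal illustration, assuming the rule applies without exception (the text only says “as a rule”); the function name is hypothetical. The first example uses the imperative singular ktr-ir cited above; the second uses the participle marker -og and the form xaga-c'og, both taken from later in this section.

```python
AUGMENT = "c‘"  # consonant augment required after thematic -a-


def attach_vocalic_suffix(stem: str, suffix: str) -> str:
    """Attach a vowel-initial suffix to a stem ending in a thematic vowel.

    Thematic -e- is truncated; thematic -a- is kept and followed by the augment.
    """
    if stem.endswith("e"):
        return stem[:-1] + suffix
    if stem.endswith("a"):
        return stem + AUGMENT + suffix
    raise ValueError(f"stem {stem!r} does not end in a thematic vowel")


print(attach_vocalic_suffix("ktre", "ir"))  # ktrir
print(attach_vocalic_suffix("xaga", "og"))  # xagac‘og
```

The sketch covers only the phonological side; which suffix a given form selects (for instance, the zero marker of xaga ‘play!’) still depends on the conjugation type and has to be listed separately.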

In both conjugation types, one can additionally distinguish “simple” stems 
and stems with suffixal “extensions”.⁶ These extensions include the following 
elements: -n(e)-, -č‘(e)-, -an(a)- and -en(a)-, as well as the causative suffix -c‘n(e)- 


6 The term “stem extension” is preferable to “suffix” because one can rarely ascribe an 
independent meaning to these elements (though causative suffixes, for instance, also belong to 
the class of extenders). On top of that, not all verbal suffixes can determine the type of 
conjugation. 
(in parentheses, we indicate the thematic vowel immediately following the 
extension). One can see that each extension unambiguously determines the choice of 
the thematic vowel. Stems with extensions are found among verbs of both conju- 
gations. 

It is stems with extensions that have an additional morphonological 
peculiarity in Armenian: they exist in two variants forming the so-called “two-stem 
conjugation”.⁷ The forms of the aorist, perfect, resultative, and imperative display a 
variant with alternation while other forms show a basic variant (the forms of the 
converbs of simultaneity behave in a particular way, see below). The alternation 
involves a simple truncation (in the case of a one-phoneme extension) or the 
replacement of the phoneme /n/ with the phoneme /r/ (in the causative suffix) or 
with the phoneme /c‘/ (in other longer suffixes). Consider the formation of 
resultative converbs: ank-n-el ‘fall’ and ank-ac, mot-en-al ‘come closer, approach’ and 
mot-ec‘-ac, mot-ec‘n-el ‘bring closer’ and mot-ec‘r-ac. 

Thus, in verbs with extensions (as well as in suppletive verbs close to them, 
of the type dnel ‘put’), one can single out two stems with their distribution de- 
pending on the grammatical meaning of the corresponding verbal form: the basic 
stem is maintained in the imperfective, destinative, subjunctive and conditional 
(as well as in the infinitive and in the converb of simultaneity) while a stem with 
alternation appears in the aorist, perfect, resultative and imperative. Semanti- 
cally, this division seems to be sufficiently transparent: overall, it corresponds to 
the (aspectual) opposition of imperfective versus perfective forms. 
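
The alternation rule and the distribution of the two stems can be sketched in Python. This is an illustration under stated assumptions, not a full model of the conjugation: extensions are given as tuples of phonemes, thematic vowels are left out, suppletive verbs are ignored, and all names are hypothetical. For the causative example, the text segments the verb as mot-ec‘n-el; the sketch groups the vowel with the root (mote-) so that the extension matches the listed suffix -c‘n-.

```python
from typing import Tuple

Extension = Tuple[str, ...]  # a stem extension as a sequence of phonemes

CAUSATIVE: Extension = ("c‘", "n")
ALTERNATING_FORMS = {"aorist", "perfect", "resultative", "imperative"}


def alternating_stem(root: str, extension: Extension) -> str:
    """Build the alternating stem from a root and its extension."""
    if len(extension) == 1:
        return root  # one-phoneme extension: simple truncation
    # Longer extensions end in /n/, replaced by /r/ (causative) or /c‘/.
    replacement = "r" if extension == CAUSATIVE else "c‘"
    return root + "".join(extension[:-1]) + replacement


def stem_for(root: str, extension: Extension, form: str) -> str:
    """Choose the basic or the alternating stem for a grammatical form."""
    if form in ALTERNATING_FORMS:
        return alternating_stem(root, extension)
    return root + "".join(extension)


def resultative_converb(root: str, extension: Extension) -> str:
    return stem_for(root, extension, "resultative") + "ac"


print(resultative_converb("ank", ("n",)))        # ankac
print(resultative_converb("mot", ("e", "n")))    # motec‘ac
print(resultative_converb("mote", ("c‘", "n")))  # motec‘rac
print(stem_for("ank", ("n",), "imperfective"))   # ankn
```

Keeping the set of alternating forms separate from the phonological rule mirrors the claim made here: the shape of the alternation depends on the extension, whereas its distribution depends on the perfective or imperfective value of the form.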

The only exception to this interpretation may be the participle of simultaneity 
(or the “subject” participle) with the marker -og. Forms of this participle have dif- 
ferent structures in different conjugations. All verbs with the thematic marker -e- 
use a basic stem (e.g., ank-n-og ‘falling’) while verbs with the thematic marker -a- 
use an alternating stem (e.g., mot-ec‘-og ‘approaching’ instead of the expected 
*mot-ena-c'og). Most probably, forms like motec'og emerged under the influence 
of participles from verbs of the a-conjugation without extension (like xaga-c'og 
‘playing’), where the suffix -c'og is an allomorph of the marker -og, which regu- 
larly appears after the thematic vowel -a-. Thus, the deviant behavior of the par- 
ticiples of simultaneity (which are imperfective from a semantic point of view) 
may be explained by morphological contamination. 

To sum up, the opposition of two stem types, though not entirely systematic 
and not applying to all verbal lexemes, expresses one more hidden category of 


7 A number of irregular verbs have a two-stem conjugation. These are verbs whose stems are 
suppletive or connected by irregular alternations: ta-/tve- ‘give’, dne-/dre- ‘put’, ga-/ek- ‘come’, 
line-/ege- ‘be’ and some others. 


244 — Vladimir Plungian 


the Armenian verb: the aspectual category of perfectivity. The value of 
perfectivity is attributed to all resultative verbal forms, to the aorist (in keeping with 
the analysis in Donabédian 2016) and to the imperative, the latter deserving special 
attention. An interesting typological feature of the Armenian imperative appears 
to be its “default” perfective interpretation. Cross-linguistically, this is not 
unique: the imperative tends to suggest a kind of completed event. 


5 Conclusion 


We have considered two “hidden” semantic oppositions in the Armenian verbal 
system. They are hidden in the sense that they have no specific segmental mark- 
ers but are manifested in the division of verbal forms into certain formal classes. 

In the first case, the role of a formal correlate is played by the division into 
synthetic and periphrastic forms, which, we believe, corresponds to the division 
of all Armenian verbal forms into those that are able to express the opposition of 
the present and past (“temporally mobile”) and those that either do not express 
the category of tense at all or do not oppose different tenses. Typologically, the 
category of temporal mobility seems to be rather idiosyncratic but it is reminis- 
cent of that of reality status (with the binary distinction of realis and irrealis). 

In the second case, it is the opposition of the basic stem and the alternating 
one that plays the role of a formal correlate. Alternation is conditioned by some 
grammatical elements in the verbal form, so a meaningful division of verbal 
forms also takes place here. In our view, the choice of the alternating stem is re- 
lated to the perfective semantics of the verbal form and one can speak of a general 
aspectual opposition of perfective and imperfective sets of forms in Armenian. 


Acknowledgement: For Johan, in remembrance of many wonderful moments in 
Antwerp and Brussels, and with verbal categories as a familiar background. 


References 


Abrahamyan, A. 1962. Bayə žamanakakic‘ hayerenum [The verb in Modern Armenian]. Erevan: 
AAS Press. 

Abrahamyan, S., N. Pafnasyan & H. Ohanyan. 1974. Žamanakakic‘ hayoc‘ lezu [Modern 
Armenian Language], volume 2. Erevan: AAS Press. 

Agayan, E. 1967. Žamanakakic‘ hayereni holovumə ew xonarhumə [The declension and 
conjugation of Modern Armenian]. Erevan: AAS Press. 


Notes on Eastern Armenian verbal paradigms —— 245 


Bybee, Joan, Revere Perkins & William Pagliuca. 1994. The evolution of grammar: Tense, aspect 
and modality in the languages of the world. Chicago: University of Chicago Press. 

Donabédian, Anaid. 1996. Pour une interprétation des différentes valeurs du médiatif en armé- 
nien occidental. In Zlatka Guentchéva (ed.), L'énonciation médiatisée, 87-108. Paris: Pee- 
ters. 

Donabédian, Anaid. 2001. Towards a semasiological account of evidentials: An enunciative ap- 
proach of -er in Modern Western Armenian. Journal of Pragmatics 33 (3). 421-442. 

Donabédian, Anaid. 2016. The aorist in Modern Armenian: Core value and contextual mean- 
ings. In Zlatka Guentchéva (ed.), Aspectuality and temporality: Descriptive and theoretical 
issues, 375-411. Amsterdam: John Benjamins. 

Dum-Tragut, Jasmine. 2009. Armenian. Amsterdam: John Benjamins. 

Dum-Tragut, Jasmine. 2010. Mood in Modern Eastern Armenian. In Björn Rothstein & Rolf 
Thieroff (eds.), Mood in the languages of Europe, 492-508. Amsterdam: John Benjamins. 

Elliott, Jennifer R. 2000. Realis and irrealis: Forms and concepts of the grammaticalisation of 
reality. Linguistic Typology 4 (1). 55-90. 

Kozintseva, Natalia A. 1988. Resultative, passive and perfect in Armenian. In Vladimir P. Ne- 
djalkov (ed.), Typology of resultative constructions, 449-468. Amsterdam: John Benja- 
mins. 

Kozintseva, Natalia A. 1991. Vremennaja lokalizovannost' dejstvija i ee svjazi s aspektual’nymi, 
modal’nymi i taksisnymi značenijami [Temporal localization of action and its connections 
to aspectual, modal and taxis meanings]. Leningrad: Nauka. 

Kozintseva, Natalia A. 1995a. Modern Eastern Armenian. Munich: LINCOM Europa. 

Kozintseva, Natalia A. 1995b. The tense system of Modern Eastern Armenian. In Rolf Thieroff 
(ed.), Tense systems in European languages, volume 2, 277-297. Tübingen: Niemeyer. 

Kozintseva, Natalia A. 1998. Pluperfect in Armenian. In Marina Ju. Čertkova (ed.), Tipologija 
vida: Problemy, poiski, rešenija [Typology of aspect: Problems, search, solutions], 207- 
219. Moscow: Jazyki russkoj kul'tury. 

Kozintseva, Natalia A. 2000. Perfect forms as a means of expressing evidentiality in Armenian. 
In Lars Johanson & Bo Utas (eds.), Evidentials: Turkic, Iranian and Neighbouring lan- 
guages, 401-417. Berlin: De Gruyter. 

Minassian, Martiros. 1980. Grammaire d'arménien oriental. Delmar, NY: Caravan Books. 

Plungian, Vladimir A. 2005. Irrealis and modality in Russian and in typological perspective. In 
Björn Hansen & Peter Karlík (eds.), Modality in Slavonic languages: New perspectives, 
187-198. Munich: Sagner. 

Rubinčik, Jurij A. 2001. Grammatika sovremennogo persidskogo literaturnogo jazyka [A 
grammar of Modern Standard Persian]. Moscow: Vostočnaja Literatura. 

Sitchinava, Dmitry V. 2013. Tipologija pljuskvamperfekta. Slavjanskij pljuskvamperfekt [Typol- 
ogy of pluperfect. Slavic pluperfect]. Moscow: AST-Press. 

van der Auwera, Johan & Vladimir A. Plungian. 1998. Modality's semantic map. Linguistic Ty- 
pology 2 (1). 79-124. 

Vaux, Bert. 1998. The phonology of Armenian. Oxford: Clarendon Press. 


Jean-Christophe Verstraete 
‘Perhaps’ in Cape York Peninsula 


Ignoratives and verbs of visual perception in epistemic 
marking 


Abstract: This paper analyzes a pattern of epistemic marking that is found in sev- 
eral Paman (Pama-Nyungan) languages of Cape York Peninsula, in the north-east 
of Australia. Formally, the pattern consists of a marker that is identical to the im- 
perative form of a verb of visual perception, optionally accompanied by an igno- 
rative of the 'thing' category or another type of marker. Semantically, these ele- 
ments mark potential verification, i.e., a weak type of epistemic meaning. The 
pattern is interesting for two reasons. From a typological perspective, it adds to 
the inventory of direct lexical sources for epistemic modality that have been iden- 
tified in the literature. The paper examines the semantics of the pattern in more 
detail, showing that, at least in its origins, its meaning can be linked to an in- 
struction for verification marked by the imperative of visual perception, with the 
ignorative as a modal reinforcer. The pattern is also interesting from an areal per- 
spective, because it is attested in five languages from three different subgroups 
of Paman, which neighbor each other geographically and which are linked by 
recurrent patterns of personal multilingualism. The spread of the pattern rein- 
forces existing arguments for the identification of a small linguistic area centered 
on Princess Charlotte Bay and its hinterland, on the east coast of Cape York Pen- 
1 Introduction 


This paper analyzes a pattern of epistemic marking that is found in several Paman 
(Pama-Nyungan) languages of Cape York Peninsula, in the north-east of Aus- 
tralia. The pattern is illustrated in the Umpithamu structure in (1), where the com- 
bination of ngaani and ngamal serves to mark epistemic possibility. 


(1) Umpithamu (Pama-Nyungan, Paman; Middle Paman)¹ 
Yupa   miintha  iluwa    ngaani  ngama-l 
today  good     3SG.NOM  IGNOR   see-IMP 
‘Perhaps she is better today.’ 


This pattern is interesting for two reasons. On the one hand, the markers involved 
can very easily be related to their lexical sources, as reflected in the glosses in (1). 
Ngaani is identical to an ignorative of the ‘thing’ category, basically a marker of 
lack of knowledge that can be glossed as ‘what’ or ‘something’ (see Section 3.1) 
while ngamal is identical to the imperative form of the verb ngama- ‘see, look’. 
The literature on the development of modality has often presented epistemic 
modal markers as “highly grammaticized” (Bybee, Perkins and Pagliuca 1994: 
205) and has tended to focus on their origins in other, non-epistemic, modal 
markers (e.g., Goossens 1982; Traugott 1989; van der Auwera and Ammann 2013). 
However, more direct lexical sources for epistemic modals have also been identi- 
fied, like ‘happen’, ‘seem’, ‘befall’, ‘I don’t know’ or ‘think’ (see Bybee, Perkins 
and Pagliuca 1994: 206; van der Auwera and Plungian 1998: 92; Boye and Harder 
2007). The pattern illustrated in (1) partly overlaps with one of these but also adds 
a new one. Given the remarkable transparency of the pattern, it is worth investi- 
gating in more detail how exactly its epistemic meaning relates to the meanings 
of its lexical sources. 

On the other hand, this type of epistemic marking is found in a clear areal 
pattern. Markers that can be traced back to verbs of seeing, in combination with 
ignoratives or other elements, are attested in five of the roughly 40 different languages 
of Cape York Peninsula. These five languages belong to three different 
subgroups of Paman but they neighbor each other geographically, they are linked 
by recurrent patterns of personal multilingualism and there is good ethnographic 


1 The following abbreviations will be used here: 1, 2, 3 first, second and third person; ACC accusative; 
APP apprehensive; DAT dative; DEM demonstrative; DU dual; EXC exclusive; FUT future; GEN 
genitive; IGNOR ignorative; IMP imperative; LOC locative; NFUT non-future; NOM nominative; PL 
plural; POT potential; PRS present; PST past; RM remote; SG singular. 
evidence for strong social links between the clans owning the languages (see 
Rigsby 1997; Verstraete 2012). In other words, there is good evidence for a small 
linguistic area here, of which the modal marker illustrated in (1) is just one reflec- 
tion. 

In this paper, I address both the semantics of the pattern of epistemic mark- 
ing illustrated in (1) and its distribution in Cape York Peninsula as part of a lin- 
guistic area. In Section 2, I provide a morphosyntactic description of the pattern 
in Umpithamu, the best-documented language in which it is attested. In Section 
3, I use these data to examine the semantics of the pattern in more detail, espe- 
cially in relation to earlier analyses of ignoratives as markers of epistemic status 
(e.g., Mushin 1995) and the available literature on the development of verbs of 
visual perception (e.g., Van Olmen 2010; Takahashi 2012). I argue that, at least in 
its origins, the semantics of the pattern can be linked to an instruction for verifi- 
cation marked by the imperative of visual perception, with the ignorative as a 
modal reinforcer (confirming a combinatorial possibility identified in Boye 2012: 
258-260). In Section 4, I examine the spread of the pattern in Cape York Penin- 
sula, using a survey of modal marking in the available grammars for the region. I 
argue that the spread coincides nicely with the linguistic area identified with the 
Princess Charlotte Bay region, for which we have other evidence of Sprachbund 
phenomena, and I hypothesize that the marker itself arose in the Lamalamic sub- 
group of Paman languages. I also discuss interaction with other strategies for ep- 
istemic marking found in Cape York Peninsula and, specifically, developments 
from apprehensional markers, which, in at least one language, intermesh with 
the pattern identified in (1). 


2 Epistemic marking in Umpithamu 


As a first step in the argument, I describe the pattern illustrated in (1) in some 
more detail. The pattern is attested in five languages of Cape York Peninsula but, 
in this section, I focus on Umpithamu, the best-documented of the five. Examples 
from the other languages, viz., Umbuygamu, Lamalama, Kuku Thaypan and 
Aghu Tharrnggala, can be found in Section 4. Unless otherwise marked, data for 
Umpithamu (as well as for Umbuygamu and Lamalama) are taken from my own 
field notes. 

Umpithamu is a Pama-Nyungan language of the east coast of Cape York Pen- 
insula, at the northern end of Princess Charlotte Bay (see Verstraete 2012 for more 
details). Genetically, it belongs to the Middle Paman subgroup of Paman lan- 
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guages (see Verstraete and Rigsby 2015: 173-194). Like most Pama-Nyungan lan- 
guages, Umpithamu has a number of markers with modal values in its paradigm 
of verb suffixes. For instance, there is a counterfactual marker -rra for events that 
could or should have taken place but did not, and a potential marker -ku for 
events that are likely, weakly desired or simply located in the future (as well as 
an imperative marker -l). In addition, there is a slot right before the verb that can 
take negation markers, and a specialized apprehensive marker, which designates 
that an event is likely but undesirable. Verstraete (2011a) provides a more detailed 
analysis of these markers and their semantic values. 

None of these features is remarkable for a Pama-Nyungan language, but the 
pattern in (1) is remarkable, both within Pama-Nyungan and in a broader typo- 
logical perspective. There are quite a few Pama-Nyungan languages that have 
one or more epistemic particles (see Section 4 for some examples) but there are 
very few cases where they can be linked back to specific lexical sources, with such 
transparency. As already mentioned, the epistemic pattern in Umpithamu can be 
related to ngaani, an ignorative of the ‘thing’ category that can be glossed as 
‘what’ or ‘something’, and to ngamal, the imperative of a verb stem glossed as 
‘look, see’. The basic ignorative use of ngaani is illustrated in the question-answer 
sequence in (2) while the basic verbal use of ngamal is illustrated in (3). 


(2) A: Amiya, ngaani ngaympi-n=inu, pigipigi? 
mother IGNOR hit-PST=2SG.NOM pig 
‘Mum, what did you get, a pig?’ 
B: Minya murrkan ngaympi-n=ayuwa 
meat.animal fish hit-PST=1SG.NOM 
‘I got some fish.’ 


(3) Ngama-l=inuwa yenu 
see-IMP=2SG.NOM up 
‘You look up there [up in the tree].’ 


The example in (1) illustrates the maximal extent of the epistemic pattern, i.e., 
ngaani and ngamal combined, in clause-final position. The Umpithamu corpus 
also shows a few permutations, however, which provide some indications about 
the origins of the pattern, as will be argued in Section 3.3. First, ngaani and nga- 
mal can both occur independently with epistemic meanings, most typically so for 
ngamal. Ngamal occurs on its own quite frequently, without ngaani, as illustrated 
in (4). When it does, it is always clause-final. Ngaani can also occur without nga- 
mal, as shown in (5), but much less frequently. In such uses, it usually takes 
clause-initial position. There is no obvious semantic difference between ngaani 
and ngamal combined, or used independently, but, in Sections 3.1 and 3.3, I will 
argue that the greater ease for ngamal to be used independently suggests that 
ngamal may be the primary modal marker in the combination, with ngaani hav- 
ing arisen as a reinforcer. 


(4) Yupa uyngka-n=ilu ngama-l 
today break-PST=3SG.NOM see-IMP 
‘Perhaps, nowadays, it’s broken.’ [discussing the current state of a particular 
rock feature] 


(5) Ngaani miintha iluwa 
IGNOR good 3SG.NOM 
‘Perhaps she is better now.’ 


A second type of permutation found in the corpus concerns the position of the 
markers. While the combination of ngaani and ngamal typically occurs clause- 
finally, it is also found more rarely split over the clause, as in (6), with ngaani and 
ngamal in their typical initial and final positions respectively. Again, there is no 
discernible semantic difference between the two options but, as I will argue in 
Section 3.3, this type of variation may offer some clue to the development of 
ngaani ngamal. Finally, there is also one attestation of ngaani ngamal in initial 
position, in the 'disjunctive' use illustrated in (9) at the end of this section. The 
existence of these variants shows that we are really dealing with a set of markers 
in a range of patterns rather than one single pattern. In what follows, I will refer 
to the whole set as ngaani/ngamal. 


(6) ngaani iya-ku-ayu ngama-l 
IGNOR go-POT-1SG.NOM see-IMP 
‘Perhaps I will go.’ 


The semantics of this pattern can be described broadly in terms of epistemic pos- 
sibility, usually glossed by speakers as mait or maitbe, which is a typical possi- 
bility marker in local creoles and forms of Aboriginal English (see Crowley and 
Rigsby 1979: 192 on Cape York Creole; Schultze-Berndt and Angelo 2013 on paral- 
lel forms in Kriol). At first sight, this may seem to overlap with the meaning of the 
verbal suffix -ku, which marks potential realization of an event and often co-occurs 
with ngaani/ngamal, as in (6) above. There is an important difference, however. 
The feature of possibility marked by ngaani/ngamal does not refer to poten- 
tial realization of an event but to potential verification — in other words, it is ex- 
clusively propositional in scope. Ngaani/ngamal is not only used with potential- 
marked verbs but also with verbs in the past or the present or with non-verbal 
predicates (which are implicitly present), as shown in (7) and (8). In such cases, 
potentiality does not relate to the occurrence of the event being described (which 
is marked as preceding or coinciding with the moment of speaking) but to verifi- 
cation of the speaker's claim about the event. The structure in (7), for instance, 
follows a lengthy description of an animal behaving strangely and states that this 
may be a sign: the predicate in (7) describes what may have happened while the 
animal was acting strangely, and ngaani/ngamal marks that this is subject to fu- 
ture verification. With potential-marked verbs, by contrast, both the realization 
of the event and the verification of claims about the event are potential. 


(7) Omoro ingkuna ngaani wuypu-n ngama-l 
father 2SG.GEN IGNOR  die-PST  see-IMP 
‘Perhaps your father has died.’ 


(8) Kaantyu  ngama-l 
kaantyu  see-IMP 
‘It might be a Kaanju person.’ 


In addition to the basic epistemic modal function of ngaani/ngamal, there is also 
one extended use, illustrated in (9). As in many languages, markers of epistemic 
possibility can also be used to convey a relation of disjunction between alterna- 
tives (see Mauri 2008). Thus, the two tokens in (9) are fully in line with the mean- 
ing of ngaani/ngamal as stated above but, in combination, they also serve to con- 
vey the existence of alternative interpretations of an event: the speaker is 
commenting on the fact that she has seen a man talking to a woman and offers 
two alternative interpretations of his motivations. As mentioned above, this use 
seems to be associated with a different position for ngaani/ngamal. There are not 
enough examples in the corpus to check if this is more than just a pragmatic strat- 
egy - the key would be to find examples that do not involve a feature of verifica- 
tion — but it is clearly in line with Mauri's (2008) findings about the semantic re- 
lation between epistemic modality and disjunction. 


(9) Ngaani ngama-l wompil-ku, ngaani yaapala-ku 
IGNOR  see-IMP  sweetheart-DAT IGNOR  talk-DAT 
‘Perhaps he wants to court her, perhaps just talk to her.’ 
3 Ignoratives, visual perception and the 
epistemic domain 


Given that the sources of the pattern described here are so transparent, the next 
obvious question is how ignoratives and verbs of visual perception could come to 
function as markers in the epistemic domain. In this section, I first discuss how 
each of these two elements separately can be linked to epistemic modality and 
then I present a hypothesis about how they could have come to be combined in 
the pattern described in the previous section. 


3.1 Ignoratives and epistemic modality 


The use of ngaani in question-answer sequences like (2) suggests that it could 
simply be analyzed as a question word like English what. For most Australian 
languages, however, this is not an adequate analysis, and this is precisely why 
such forms are relevant to the epistemic domain. As shown by Mushin (1995), 
these forms can systematically be used both in questions and in statements about 
lack of knowledge, which suggests that they are not simply interrogatives. Fol- 
lowing Durie (1985) and McGregor (1990), Mushin (1995) argues that apparent in- 
terrogative forms in Australian languages usually have a basic meaning of lack 
of knowledge. In Umpithamu, for instance, ngaani can be used in questions, as 
in (10), and in statements, as in (11). What the two uses have in common is that 
ngaani signals a lack of knowledge. The distinction between “question” and 
“other” uses simply falls out from the context, i.e., whether the clausal context 
has other features that mark it as a request for information and whether this gets 
picked up by the interlocutor as the first turn of an adjacency pair. In (11), for 
instance, there is no indication in the first clause that it is to be taken as a request 
for information. Accordingly, the same speaker simply continues the turn, in this 
case by specifying what exactly was being made. 


(10) A: Minya ngaani? 
meat.animal IGNOR 
‘What kind of animal?’ 
B: Yaathantyi 
carpet.snake 
‘A carpet snake.’ 
(11) Iluwa ngaani yula-n=iluwa. Wiingal yula-n=iluwa. 
3SG.NOM IGNOR make-PST=3SG.NOM boomerang make-PST=3SG.NOM 
‘He made something/I-don’t-know-what. He made a boomerang.’ 


This basic meaning shows some further extensions in Umpithamu, again in line 
with the general paths of development proposed in Mushin’s (1995) typological 
study. On the one hand, the ignorative can also be used as a hesitation marker, 
as shown in (12), where ngaani signals a brief word search that is immediately 
resolved locally. On the other hand, it can also be used as a determiner-like ele- 
ment (whose precise semantics remains unclear), as in (13). 


(12) Ngaani, yeerra-mpal ungki-ngka-ilu-ungku 
IGNOR coffin-LOC put-PRS-3SG.NOM-3SG.ACC 
‘He puts it into, what-do-you-call-it, a coffin.’ 


(13) Yukurun ngaani yitha-n=antyampa kuurra 
gear IGNOR  leave-PST=1PL.EXC.NOM behind 
‘We left some gear behind.’ 


Given that the basic meaning of apparent interrogative forms in Australian lan- 
guages relates to lack of knowledge, Mushin (1995) proposes to call them “epis- 
tememes”, following Durie (1985). In the context of this analysis, I prefer the al- 
ternative term “ignorative” because it highlights the feature of lack of knowledge. 
Regardless of terminology, however, Mushin’s analysis shows quite clearly why 
such markers could come to serve as epistemic elements. Their basic function is 
already in the broad domain of marking knowledge states — in this case, marking 
lack of knowledge about a specific entity. How, then, could a marker of lack of 
knowledge become part of a larger pattern for marking epistemic possibility as 
described in Section 2? I believe this involves two steps, one explained in this 
section and a second to be explained in Section 3.3. 

The first step is to consider the nature of the category targeted by the ignora- 
tive. As shown by Mushin (1995), Australian languages usually have a range of 
ignoratives for different categories, such as things, persons and places. In Um- 
pithamu, ngaani is the ignorative for the ‘thing’ category, which contrasts with 
wanthamu for the ‘person’ category, wanthawa for the ‘place’ category and 
angampal for the rest (mainly manner, time and quantity). ‘Things’ are the most 
obvious referent for ngaani, as shown in the examples above, where the marker 
consistently targets a discrete non-human entity. But these are not the only pos- 
sible targets. In Umpithamu, ngaani also serves as a more abstract ignorative, 
marking a lack of knowledge about propositions and events rather than just en- 
tities. This is illustrated in (14), where ngaani does not mark lack of knowledge 
about a specific entity but about what is happening: the father’s response to the 
child’s ignorative is a full proposition describing an event rather than an entity. 
The structure in (15) illustrates a related use, also analyzed in Mushin (1995), 
where ngaani forms the basis for an ignorative of reason (‘why, what for’), with 
the dative marker -ku. Again, the target of this marker is not an entity but a prop- 
osition or event: the people being addressed are crying because they thought the 
speaker had died. 


(14) “Yoompi-l=inuwa, yoompi-l!” 
stand-IMP=2SG.NOM stand-IMP 

Yoompi-n=ayu “Ngaani omoro?” 
stand-PST=1SG.NOM IGNOR father 

“Anharra alu wuna-ngka=iluwa.” 
saltwater.crocodile DEM lie-PRS=3SG.NOM 

‘“Stop, stop!” I stopped, “Dad, what’s going on?” “There’s a saltwater crocodile 
over there.”’ 


(15) Ngaani-ku mi'athi-ngka-uurra-athungku 
IGNOR-DAT cry-PRS-2PL.NOM-1SG.ACC 
‘Why are you all crying for me?’ 


Thus, ngaani in Umpithamu is also a more abstract type of ignorative, which can 
mark lack of knowledge about events rather than just non-human entities. If its 
basic meaning is to express lack of knowledge about events, this is not actually 
that far from signaling potential verification, the basic meaning of ngaani/ngamal 
as discussed in Section 2. A very similar argument is actually made by Boye (2012: 
24-27), who argues that these two meanings are different instantiations of what 
he calls “neutral support”, the lowest value on a scale of epistemic strength. The 
two meanings are not entirely equivalent, of course, because expressing a lack of 
knowledge about an event need not imply a need for future verification. Con- 
versely, however, signaling potential verification does typically imply that one 
lacks knowledge about the event being discussed. In this sense, uses of ngaani 
targeting events or propositions are semantically close to the meaning of 
ngaani/ngamal, but not equivalent. I believe this is also the reason why ngaani is 
not the dominant partner in the set ngaani/ngamal (see also Sections 2 and 4). In 
the next section, I will argue that ngamal has a more directly epistemic meaning 


256 —— Jean-Christophe Verstraete 


and, in Section 3.3, I will round off the analysis by providing some tentative evi- 
dence that ngaani may have arisen as a reinforcer of ngamal, semantically com- 
patible with but not equivalent to the more basically epistemic marker ngamal. 


3.2 Verbs of seeing and epistemic modality 


At first sight, the use of verbs of seeing in epistemic marking may seem less sur- 
prising than that of ignoratives. There are well-known proposals about metaphor- 
ical links between the domains of vision and knowledge or understanding (most 
prominently, Sweetser 1990) and there is a rich literature about the grammatici- 
zation of imperatives of verbs of vision (e.g., Van Olmen 2010). Neither of these 
lines of argument can be used in a direct way to explain the epistemic meaning 
of ngamal in the ngaani/ngamal set, however. On the one hand, there is no evi- 
dence in Umpithamu (or in any of the other languages studied in this paper) that 
‘see’ verbs have secondary senses of knowing or understanding, which could 
then serve as a bridge to epistemic uses. This is in line with the more general ar- 
gument developed in Evans and Wilkins (2000) that, in Australian languages, the 
domain of understanding tends to be conceived in terms of hearing rather than 
vision,² as also reflected in the Umpithamu structure in (16), where wisdom is described 
in terms of strong hearing rather than strong vision. On the other hand, 
the typical grammaticization targets of visual perception verbs are in the domains 
of information structure and expressivity (Van Olmen 2010), with argument-in- 
troducing uses like the English structure in (17) coming closest to the epistemic 
domain (see Van Olmen 2010). Again, such uses do not provide any parallels to 
ngamal as studied in this paper, since argumentative uses are epistemically much 
stronger than the possibility markers studied here. For instance, the use of look 
in (17) suggests a strong degree of epistemic commitment to both the argument 
and the conclusion that can be derived from it. 


(16) Omoro athuna wina wakara  iya-n-iluwa 
father 1SG.GEN ear strong go-PST-3SG.NOM 
‘My father was clever.’ 


2 In fact, the only exception described in Evans and Wilkins (2000) is in a number of languages 
of southern Cape York Peninsula, like Guugu Yimidhirr, where verbs of visual perception do 
have a sense of knowing (which may itself be due to a secondary development of seeing to hear- 
ing; see Evans and Wilkins 2000: 551). This pattern is not attested in any of the languages studied 
in this paper, however, and even if it was, a sense of knowing would be unlikely to develop into 
a weak epistemic marker like the one studied here. 
(17) The end of the stage would be hard for any team to control, just look what 
happened to Movistar when Simon Yates won. 
(http://www.ciclismointernacional.com/vuelta-a-espana-2016-stage-9-preview) 


How, then, could the imperative form of a verb of visual perception develop into 
an epistemic marker? The first point to note is that ngama- in Umpithamu is vague 
between an intentional sense (‘look’) and a non-intentional one (‘see’), as illus- 
trated in (18) and (19). This is a well-known pattern in Australian languages, at- 
tested for many verb meanings besides visual perception (see Dixon 2002: 57). 


(18) Ngama-n=ina-ingku, ngo’oyi 
see-PST=3PL.NOM-3SG.ACC nothing 
‘They looked at it, but nothing [it had disappeared].’ 


(19) Nhuwal ngama-n=ayu-ungku 
bubble see-PST=1SG.NOM-3SG.ACC 
‘I saw a bubble.’ [speaker noticing a sign of an animal in the water] 


This distinction is relevant to the current discussion, because many of the gram- 
maticization targets for imperatives of visual perception described in the litera- 
ture are in fact derived from verbs of intentional perception. As shown by Van 
Olmen (2010), these tend to develop into strong markers that imply or even en- 
code relatively strong epistemic commitment on the part of the speaker, like at- 
tention-getters or argument-introducing markers like (17). This is precisely the 
type of development that is largely absent in Australian languages (see Evans and 
Wilkins 2000), and even if it were attested, it is unlikely as a source for weak ep- 
istemic markers like ngamal. 

The development of verbs of non-intentional visual perception is much less 
well-studied in the literature, but if we look at what is available, there are indica- 
tions that they tend to develop into weaker markers than their intentional coun- 
terparts, which do not imply or encode strong commitment on the part of the 
speaker. One path that has been reasonably well-studied is the development of 
verbs of seeing to conative meanings of trying (e.g., Voinov 2013), as illustrated 
in the Yimas structure in (20), where the conative marker is analyzed as deriving 
from a ‘see’ verb in a serial verb construction (Foley 1986: 152, as quoted in Voinov 
2013). Conative structures like these do not encode modal meanings as such, but 
if they trigger any modal inferences, they will not imply strong modal commit- 
ment, given that the speaker conceives of the event as being attempted rather 
than realized. 


(20) Yimas (Lower Sepik-Ramu, Lower Sepik) 
Na-mpi-kwalca-tay-ntut 
3DU>3SG-arise-see-RM.PST 
‘They both tried to wake him up.’ 

(Foley 1986: 152) 


There is another path, however, that is not really well-described in the literature 
but more immediately relevant to modality. The key to this path lies in the inter- 
pretation of the imperative marker in ngamal. In principle, an imperative form is 
most easily compatible with the intentional sense, because imperatives entail 
some degree of control over the action being described: thus, ‘look!’ is more easily 
interpretable than ‘see!’. As just mentioned, however, grammaticized ‘look!’ 
forms typically imply or even encode strong epistemic commitment and are there- 
fore not a good candidate source for a weak epistemic marker. The alternative is 
the non-intentional sense, i.e., ‘see!’, but the question is how this could be inter- 
preted with an imperative marker. As argued in Jary and Kissine (2016), in such 
cases, imperatives generally tend to coerce intentional readings: for instance, the 
use of the imperative with the uncontrolled verb know in English, as in (21), coerces 
an interpretation along the lines of ‘make sure you know the answer’. 


(21) Know the answer! 
(Jary and Kissine 2016: 143) 


Along the same lines, in the case of ‘see!’ forms, one cannot be ordered to perceive 
something but one can be ordered to be open to such perception, i.e., to be recep- 
tive to information that is not yet available. Thus, for instance, in English, see can 
be used in imperatives and other deontic forms addressing the interlocutor, as 
illustrated in (22). In all of these cases, see can be analyzed as an instruction to 
be receptive to future information in order to make a judgement or decision: this 
information may concern the further course of current events in (22a), the results 
of one's actions in (22b) or arguments for the validity of a proposal in (22c). 


(22) a. Wait and see before you make a judgement. 
(http://www.pbs.org/newshour/rundown/students-eastern-michigan- 
u-protest-kkk-racist-graffiti) 
b. Set aside $25 for a test and see how you go. 
(http://www.blogtyrant.com/start-a-blog-2014) 

c. “But, you know, Terry, the St. Anne’s case is highly technical, and of course 
there’s a lot at stake. I think it calls for the judicial temperament and ex- 
pertise of someone who has experience in these difficult matters, someone 
like, say, Judge Irving Samuels.” “Let’s see.” Terrence consulted the court 
docket and turned a few pages. “Yes, Samuels. He’s sitting on criminal 
cases. But let me see what I can do.” 

(Takahashi 2012: 29) 


This type of use offers an interesting model to explain the contribution of ngamal 
to the ngaani/ngamal set. In the English examples in (22), the scope of the judgement 
is potential action: ‘be receptive to future information in order to decide 
whether you (or we) should do X’. If the scope is propositional, however, i.e., 
knowledge rather than action, an instruction to be receptive to future information 
amounts to potential verification: ‘be receptive to future information in order to 
decide if X is true’. In other words, if the model of structures like the ones in (22) 
is relevant, ngamal may have started its course toward epistemic meaning in 
ngaani/ngamal as a hedge, meaning something like ‘(let’s) see if it’s true’. 

This type of analysis is tentative, of course, but it does account for some pe- 
culiarities of ngamal in ngaani/ngamal that are hard to deal with in alternative 
accounts, specifically: (i) the implausibility of the volitional sense of ngama- as a 
relevant source, (ii) the interpretation of the imperative with the non-volitional 
sense of ngama- and (iii) the specific meaning of potential verification associated 
with ngaani/ngamal. In addition, this analysis is also compatible with the few 
Australian cases for which extensions of visual perception into the domain of cog- 
nition have been observed: the data in Evans and Wilkins (2000: 575-576) sug- 
gest that if verbs of visual perception do develop meanings relating to the domain 
of cognition (e.g., meanings like ‘recognize’ or ‘deduce’), the relevant source is 
usually the non-intentional sense rather than the intentional one. 


3.3 Ignoratives and verbs of seeing combined 


The two previous sections have presented hypotheses about how ignoratives and 
verbs of seeing could have developed epistemic meanings: ignoratives mark lack 
of knowledge about events while ‘see’ imperatives may have originated as in- 
structions for verification. In this sense, the meaning of 'see' imperatives is se- 
mantically closest to the meaning of the ngaani/ngamal pattern as a whole, as 
also reflected in the fact that they are dominant overall. In this section, I present 
a final hypothesis — more speculative than the other ones — about how ngaani 
and ngamal may have come to be combined. I suggest that ngaani may have orig- 
inated as a reinforcer of ngamal, through a discourse pattern that favors initial 
indeterminacy. The resulting pattern is in line with Boye’s (2012: 257-274) typology 
of epistemic combinations, representing a “harmonic” combination of two 
distinct instantiations of the lower value of the epistemic scale (“neutral support”). 

In order to substantiate this hypothesis, I start out from three relevant obser- 
vations. One, just mentioned, is that ngaani is optional in the ngaani/ngamal pat- 
tern, with ngamal frequently occurring without ngaani. The second is that, when 
combined, ngaani varies between its typical position next to ngamal at the end of 
the clause and a rarer split pattern whereby ngaani is in initial and ngamal in final 
position, as shown in (23) and (24) (see also Section 2). 


(23) Errpe-n=ilu-ungku “Omoro ingkuna ngaani ngama-l” 
tell-PST=3SG.NOM-3SG.ACC father 2SG.GEN IGNOR see-IMP 
‘He told him: “Perhaps it’s your father.”’ 


(24) Ngaani  atha-ku=ayu ngama-l 
IGNOR eat-POT=1SG.NOM see-IMP 
‘Perhaps I’ll eat it.’ 


The third observation is that ngaani as a thing-ignorative is also found as part of 
a specific discourse pattern, whereby a referent is initially introduced by an igno- 
rative and then further specified by the same speaker at the end of the same 
clause, as in (25) and (26), or in the following clause, as in (11). Examples like 
these are not just instances of word searches, unlike (12). In fact, this discourse 
pattern is in line with a marked preference for indeterminate expressions in many 
Aboriginal languages (e.g., Povinelli 1993; Blythe 2009; Garde 2013: 10-14), es- 
pecially in contexts that require circumspection, like talking about supernatural 
beings or recently deceased people — both (25) and (26) describe the actions of 
supernatural beings. 


(25) Ngaani maarra-n-ilu yaangkun 
IGNOR  bring-PST-3SG.NOM shell 
‘He brought something, a type of shell.’ 
(26) Ngaani angampal yongki-n=ilu yawul 
IGNOR IGNOR come-PST=3SG.NOM big 
‘Something was coming like that, something big.’ 


Taken together, these three observations suggest a hypothesis about the relation 
between ngaani and ngamal. Given that ngamal is the primary epistemic marker, 
ngaani may have originated as a reinforcer, with the indeterminate-first discourse 
pattern providing a model for the use of the ignorative ngaani as a hedge preced- 
ing the description of the event. If ngaani can refer to both an entity and an event, 
as shown in Section 3.1, the indeterminate-first pattern ngaani X can apply not 
just to entities, i.e., ‘I don’t know what, X’ as in (25) and (26), but also to events, 
i.e., ‘I’m not sure but I say X’. In other words, an initial ignorative could serve as 
a general hedge for the proposition that follows (incidentally, this could also ex- 
plain the rare cases where ngaani serves as an epistemic marker on its own, as in 
[5]). If that is the case, structures that are otherwise epistemically marked, like X 
ngamal ‘I say X, (let’s) see if it’s true’, could have been reinforced by an initial 
ignorative as a hedge. Thus, ngaani X ngamal could be glossed as ‘I’m not sure 
about this (initial hedge), but I say X (event), let’s see if it’s true (epistemic 
marker)’. 

This is speculative, of course, but it is plausible in that it does account for 
some of the specifics of the relation between ngaani and ngamal: (i) the domi- 
nance of ngamal and the optionality of ngaani as described in Section 2, (ii) the 
more specifically epistemic meaning of ngamal and the more general meaning of 
ngaani as described in Sections 3.1 and 3.2 and (iii) the variation between split 
and joint final positions, with the split position possibly reflecting the origins of 
the structure. Interestingly, even ngaani ngamal in final position can (rarely) be 
accompanied by initial ngaani, which suggests that reinforcement could still be 
productive and work in a cyclical pattern. 


(27) Ngaani kali-ku=ayuwa, ngaani ngama-l 
IGNOR carry-POT=1SG.NOM IGNOR see-IMP 
‘Perhaps I will take him.’ 


The broader regional survey in the next section provides a further argument in 
favor of the reinforcement hypothesis, in the sense that not all languages with 
‘see’ imperatives combine these with ignoratives and one even combines them 
with another type of marker. 


4 The pattern in its regional context 


To round off this study, this section examines the pattern presented in (1) in its 
broader regional context, with a survey of epistemic particles in the languages of 
Cape York Peninsula. Similar patterns are found in four other languages in the 
region, which belong to three different subgroups of Paman but are linked geo- 
graphically, socially and sociolinguistically in a small linguistic area centered 
around Princess Charlotte Bay, on the east coast of the peninsula. There is inde- 
pendent linguistic evidence for this areal grouping, which means that the epis- 
temic pattern described here joins a number of other phenomena shared across 
the languages. 

In order to place the pattern in its regional context, I examined all of the 
grammars, sketches, dictionaries and word lists of languages of Cape York Pen- 
insula to which I have access. In total, I examined materials for 31 languages, 23 
of which have epistemic particles with meanings that are broadly comparable to 
the pattern in (1). I also checked for potential sources for these particles, by fur- 
ther examining the word lists for related verbal or nominal roots and the gram- 
mars for related morphosyntactic elements. The majority of cases have no obvi- 
ous source at all. There are a few particles with potential grammatical sources 
and, apart from the pattern in (1), there are no other lexical sources. 

The grammatical sources found in the survey are mainly conditional and ap- 
prehensive markers. In Umpila, for instance, the conditional marker achu can 
mark epistemic possibility when suffixed by an epistemic clitic -ki (Thompson 
1988: 48, 103). In Kuku Thaypan, there is an epistemic marker ame, illustrated in 
(28), which most likely derives from an apprehensive marker that is itself derived 
from the lexeme ame ‘person’.³ Conditionals and apprehensives are obvious 
grammatical sources for epistemic modality, because they already serve to mark 
potentiality for events: events in possible worlds in the case of conditionals, and 
potential but undesirable events in the case of apprehensives.⁴ The final step toward 
epistemic modality would then appear to be a transfer from potential events 
to potential verification, i.e., toward a strictly propositional scope. 


3 I could not find the apprehensive meaning in the (limited) materials I have access to for Kuku 
Thaypan, but it is attested in the closely related language Aghu Tharrnggala (e.g., Jolly 1989: 54; 
see also example [31]). Also, apprehensive and/or prohibitive markers derived from the lexeme 
‘person’ are well-attested in the neighboring Lamalamic languages to the north of Kuku 
Thaypan. 


(28) Kuku Thaypan (Pama-Nyungan, Paman, Alaya-Athima) 
Ame aca anhdhi-n anay 
maybe mouth burn-NFUT 1SG.ACC 
‘Maybe my mouth got burned.’ 
(Rigsby n.d.) 


Apart from the grammatical sources just mentioned, all of the lexical sources 
found in the survey follow the pattern observed in (1), i.e., an imperative of a ‘see’ 
verb, possibly accompanied by another element. In addition to Umpithamu, this 
pattern is found in Umbuygamu and Lamalama, two Lamalamic languages to the 
south of Umpithamu, and in Aghu Tharrnggala and Kuku Thaypan, two Alaya- 
Athima languages (see Alpher 2016) to the south and southwest of Lamalamic. 
The pattern in Umbuygamu is most similar to that in Umpithamu, consisting of a 
‘see’ imperative with an optional thing-ignorative, as shown in (29) and (30). The 
available corpus is not large enough to be sure about positions but magal on its 
own appears to be mainly clause-final. The pattern in Lamalama only consists of 
a ‘see’ imperative in final position, as shown in (31), with no evidence for support 
from an ignorative. The same applies to Kuku Thaypan, for which tang, identical 
to ta-ng ‘see!’, is glossed as ‘perhaps’ (Rigsby 1976: 70). In Aghu Tharrnggala, fi- 
nally, the same marker tang is found, again identical to ta-ng ‘see!’, optionally 
with the support of the apprehensive marker me, as shown in (32). The one con- 
stant element in all five languages is the ‘see’ imperative, which again confirms 
that this is the most basically epistemic element in the pattern and that other el- 
ements like ignoratives or apprehensives may have developed as reinforcers that 
are compatible with epistemic possibility but do not necessarily encode it (cf. 
Boye’s [2012: 258-260] harmonic combinations). 


(29) Umbuygamu (Pama-Nyungan, Paman, Lamalamic) 
Ani  magal  pipim te-y-la 
IGNOR see-IMP tomorrow come-POT=3SG.NOM 
‘Maybe he will come tomorrow.’ 


4 In some cases, it is hard to distinguish between apprehensives and epistemic modality, especially 
if the only examples available are future-oriented, for which event potentiality and verification 
potentiality coincide. 


(30) Umbuygamu 
Udom  maga-l 
salt see-IMP 
‘Maybe it’s poisonous.’ 


(31) Lamalama (Pama-Nyungan, Paman, Lamalamic) 
Lam ‘awarr nua-y=ta makal 
hand three  lie-POT=2SG.NOM perhaps 
‘Will you stay three days perhaps?’ 


(32) Aghu Tharrnggala (Pama-Nyungan, Paman, Alaya-Athima) 
Me ta-ng lio ta-ng tuo-go ya ninh 
APP see-IMP now see-IMP hit-FUT 1SG.NOM 2SG.ACC 
‘I might hit you.’ 
(Jolly 1989: 104; partly re-glossed) 


From a genetic perspective, the pattern described here is found across three dis- 
tinct subgroups of Paman, viz., Middle Paman (Umpithamu), Lamalamic (Um- 
buygamu, Lamalama) and Alaya-Athima (Aghu Tharrnggala, Kuku Thaypan). 
This distribution is not random. First, the five languages are geographically con- 
tiguous, centered on Princess Charlotte Bay and its hinterland: Umpithamu is the 
only Middle Paman language that neighbors Lamalamic languages (to its south) 
and Aghu Tharrnggala and Kuku Thaypan are the only Alaya-Athima languages 
that neighbor Lamalamic (to their north and east). For Umpithamu and the Lama- 
lamic languages, moreover, there is good evidence that the languages were 
linked through recurrent patterns of personal multilingualism, themselves medi- 
ated through patterns of intermarriage between Umpithamu-, Umbuygamu- and 
Lamalama-owning clans (for more details, see Rigsby 1997; Verstraete 2012; Ver- 
straete and Rigsby 2015: 8-17). This strong social network has resulted in a small 
linguistic area, with a whole range of morphosyntactic features that are shared 
among the languages, usually transferred from Lamalamic to Umpithamu, which 
is structurally very different from other Middle Paman languages (see Verstraete 
2011b, 2012 for some examples, including pronominal marking and impersonal 
constructions). We can now add the rare epistemic pattern in (1) to this set of fea- 
tures. It is unclear if there was a similarly strong social network linking Aghu 
Tharrnggala and Kuku Thaypan to Lamalamic clans, but there is again some evi- 
dence of patterns of multilingualism and intermarriage (Verstraete and Rigsby 
2015: 62-63). In any case, the fact that the epistemic pattern is also shared with 
these languages suggests that the linguistic area may have reached into neighboring 
Alaya-Athima languages. Moreover, it reinforces the idea that Lamalamic 
languages form the core of the area, from which features are spread to the other 
languages. In the case of the epistemic pattern described here, all of the Lama- 
lamic languages show the pattern (except for Rimanggudinhma, the most poorly 
documented of the Lamalamic languages, for which no epistemic particle has 
been documented). Conversely, none of the Middle Paman languages other than 
Umpithamu shows the pattern. Nor do those Alaya-Athima languages that 
I was able to check (Ikarranggal and Ogh Undjan; most of the other languages in 
this subgroup are very poorly documented). In other words, it is most likely that 
the pattern is an originally Lamalamic feature that spread to its Middle Paman 
and Alaya-Athima neighbors. 


5 Conclusion 


With this small study of an epistemic marker in a few languages of Cape York 
Peninsula, I hope to have contributed to our understanding of epistemic modality 
in a number of ways. First, the pattern studied here adds to the inventory of direct 
lexical sources of epistemic modality identified in the literature (Bybee, Perkins 
and Pagliuca 1994: 206; van der Auwera and Plungian 1998: 92; Boye and Harder 
2007), with ignoratives and especially verbs of visual perception. Second, I have 
also proposed hypotheses about how these elements could be relevant to epis- 
temic meaning, refining an earlier analysis by Mushin (1995) in the case of igno- 
ratives and proposing a new path to epistemic meaning in the case of ‘see’ imper- 
atives as instructions for verification. I have also presented a more speculative 
idea about how ignoratives may have originated as hedges reinforcing the more 
basically epistemic pattern of the ‘see’ imperative. Finally, the distribution of the 
pattern studied in this paper confirms the existence of an areal pattern in the 
Princess Charlotte Bay region, which may reach further than previously thought, 
also taking in languages from the Alaya-Athima subgroup. 


Jacqueline Visconti 
On the origins of Italian anzi 


Abstract: The diachronic investigation of discourse markers has proven challeng- 
ing since its inception in the late Eighties. Their context dependency and frequent 
association with informal, colloquial usage have raised methodological, as well 
as theoretical, questions, as historical work has to rely on written texts, which 
record speech with varying degrees of accuracy, and provide no access to pro- 
sodic cues. Using Old to Present Day Italian databases, in particular the Opera del 
Vocabolario Italiano, the contribution details the evolution of discourse marker 
anzi ‘on the contrary’ from spatial and temporal uses to its present-day 
contrastive-corrective function, by focusing on the role of the comparative struc- 
ture in the shift. The importance of different types of contexts and genres will be 
discussed, for instance, Old Italian volgarizzamenti, translations or adaptations 
(or both) of Latin prose originals into vernacular versions, where the rendering 
with anzi can be compared to the original item in the Latin source text. 


Keywords: contrast, discourse markers, diachrony, Italian 


1 The evolution of anzi: Bazzanella (2003), 
Visconti (2015) and Musi (2016) 


Alongside the mesmerizing cross-linguistic span of his research, a more subdued 
thread runs, in my perception, through Johan van der Auwera’s vast and diverse 
production: a love for the more challenging “procedural” aspects of meaning, whether 
these are realized in modality or in “little words”, such as scalar additive operators, 
negative markers or connectives. 

In this contribution, I will look at the origins of the Italian contrastive-correc- 
tive marker anzi ‘on the contrary’. Starting from Bazzanella (2003), Visconti 
(2015) and Musi (2016), I will highlight some unresolved questions and suggest a 
possible new hypothesis. Data are from the large corpus Opera del Vocabolario 
Italiano (OVI), totalling 17,677,486 tokens of Tuscan texts from the thirteenth and 
fourteenth centuries and available online (e.g., Beltrami and Boccellari 2006). 

Bazzanella (2003) considers both Latin ante and Italian anzi. Whereas ante 
has spatial, temporal and comparative functions, Old Italian anzi has both a tem- 
poral function, as in (1), and “contrastive-corrective” uses, as in (2). Only the lat- 
ter survive in Present-Day Italian. 


(1) pregò Domenedio e disse: Segniore Dio io ti prego, che tu mi facci due cose 
anzi ch'io muoia 
‘He prayed to the Lord and said: Lord I pray that you do two things for me 
anzi (before) I die.’ 
(OVI, Andrea da Grosseto, 1268 (tosc.) L. 3, cap. 2, 182.5) 


(2) ché quelli che non teme Dio non è forte, anzi è pazzo 
‘For he who does not fear God is not strong, anzi (rather) he is mad.’ 
(OVI, Egidio Romano volg., 1288 (sen.), L. 1, pt. 2, cap. 13, 43.16) 


As can be seen in (2), the contrastive-corrective use, which is available from the 
very first data, is typically realized in the form non p, anzi q, where negation has 
scope over an entity already present in the discourse, typically someone else's 
point of view, which is refuted and replaced by q. 

In some cases, the negation is absent, as in (3), where anzi is used to introduce 
a “better” formulation to replace the first one (p, anzi q).¹ 


(3) non ti maravigliare se li uomini vanno a Dio, ché Dio venne alli uomini, anzi 
ne li uomini 
‘Do not wonder if men go to God, as God came to men, anzi (rather) in men.’ 
(OVI, Fiori di filosafi, 1271-1275 (fior.), pag. 194.10) 


Bazzanella (2003: 135) considers the evolution of anzi a case of “modal 
drift” (deriva modale), which proceeds from spatial and temporal values to comparison 
and then contrast, according to the cline (which is not to be understood as 
strictly unidirectional, however) correlazione-opposizione-confronto-preferenza-contrasto-correzione 
‘correlation-opposition-comparison-preference-contrast-correction’. 


1 The proposed study is qualitative in nature. Indications of frequency are thus merely indicative. 


Building on this account, Visconti (2015) tries to identify the contexts that 
may have favored the shift from the temporal to the corrective value. According 
to her study, based on the OVI corpus, a crucial role in the shift from temporal to 
corrective is played by the construction anzi p che q ‘anzi p than q’, which takes the 
form of a comparative structure. Consider (4) and (5): 


(4) li buoni debbono anzi amare lo giudice che temere 
‘Good men must anzi (before/rather) love the judge than fear him.’ 
(OVI, Andrea da Grosseto (ediz. Selmi), 1268 (tosc.) L. 2 cap. 40, 133.26) 


(5) affaticati anzi per te che per altrui... 
‘Labour anzi (before/rather) for yourself than for the others.’ 
(OVI, Fiori di filosafi, 1271-1275 (fior.), 119.9) 


Placing two states of affairs in a relation of temporal sequence, in a deontic or 
future reference context like (4) and (5), may indeed suggest an inference of precedence 
and priority, which is well-attested in studies on different languages.² The subsequent 
step is an inference of rejection of the alternative in q, which paves the way 
for the shift from preference to correction. Consider hraðor, the comparative 
of hræþe ‘quick, early’, for instance. In Old English, it had both temporal precedence 
and preference values. As pointed out by Traugott and König (1991: 206), 
in contexts such as (6), we have an inference of refusal of one of the alternatives 
(‘not to get married’). 


(6) His daughter, who had chosen the Lord, would rather die than get married. 
(Traugott and König 1991: 206) 


Similarly, in (4), anzi amare che temere ‘rather love than fear’ would invite the 
inference ‘not fear’ and thus non temere, anzi amare ‘not fear, rather love’: the 
construction anzi p che q would thus prepare the ground for the shift from tem- 
poral sequence to correction, via precedence and priority. 

In her study of the diachrony of anzi and invece ‘instead’, Musi (2016) sepa- 
rates adverbial/prepositional anzi from the conjunction anziché. Her argument is 
that, for the former, as seen, spatial-temporal and contrastive uses coexist from 
the beginning while, in the latter, the different stages in the development can still 
be identified (Musi 2016: 8). In particular, for anziché, cases like (7) can be found, 
in which the conjunction can be interpreted either as expressing anteriority between 
two states of affairs or as a marker of contrast. 


2 See, for instance, Traugott and König (1991) on English hraðor to rather, Cuenca (1992: 187) on 
Catalan ans, Rodríguez Somolinos (2002) on French ainz and Bazzanella (2003) on ante. 


(7) eh, maestro: i' ho veduto cosa che molto mi dispiace all'animo mio: ch'io vidi 
un vecchio di grandissimo tempo fare laide mattezze: onde, se la vecchiezza 
n'ha colpa io m'accordo di voler morire giovane anziché invecchiare e mat- 
teggiare 
‘Eh, master: I saw something which really displeased me: I saw an old man 
of really advanced age committing terrible follies: therefore, if old age is re- 
sponsible for that, I have decided that I want to die young anziché (be- 
fore/rather than) become old and go mad.’ 

(OVI, Novellino, 68, thirteenth century, Musi 2016: 9) 


According to Musi (2016: 10), the conjunction's function of indicating preference 
is even clearer in examples where anzi is separated from the complementizer, as 
in (8). 


(8) io le diedi per no' potere fare altro, e vollile anzi mandare che ritenerlle 
‘I have given them because I could not do anything else and I wanted anzi 
(rather) to send them than to keep them.’ 
(LIZ, Lett. Pist., 1320-1322, Musi 2016: 10) 


In examples of this kind, it is argued, the conjunction anziché expands to contexts 
that are incompatible with a temporal meaning, such as (9). 


(9) Tuttavolta il dolore somiglia anzi la quiete che l'inquietudine... 
‘Sometimes pain resembles anzi (rather) quietness than anxiety...’ 
(LIZ, Tasso, De la Gelosia, 2. 137. 1585, Musi 2016: 11) 


We notice, however, that the label of conjunction may be problematic in this example, 
given that anzi is followed by a nominal phrase. Interestingly, moreover, 
the use in (9) resembles the structure called “comparative” by Visconti (2015) for 
(4) and (5). For Musi (2016: 11) too, indeed, “the parallelism between two entities 
plays a fundamental role in the rise of the contrastive value”. 

As far as the evolution of the adverb anzi is concerned, Musi (2016) suggests 
that the contrastive value emerges from contexts following a negative clause, 
such as (2). 


2 Unresolved questions 


The main issue, left open by Visconti (2015), concerns the role of the comparative 
structure in the development from the temporal to the corrective value of anzi and 
thus the relationship between the constructions anzi p che q ‘anzi p than q’ and 
non q, anzi p ‘not q, anzi p’. 

The hypothesis of a crucial role of the comparative in the evolution from spa- 
tial-temporal to preference and correction finds support both in other languages, 
as noted above, and in the development of other conjunctions in Italian, such as 
ma ‘but’ or piuttosto ‘rather’ (e.g., Mauri and Giacalone Ramat 2015). In particular, 
the derivation of ma (and its equivalents in Romance languages) from the 
Latin comparative adverb magis ‘more’ represents a significant precedent for 
anzi. The hypothesis, detailed in Marconi and Bertinetto (1984), concerns the 
shift from a construction of the kind p magis quam q ‘p more than q’ to corrective 
(of the German sondern type) non q, ma(gis) p ‘not q, but p’, as in (10). 


(10) id, Manli, non est turpe, magis miserum est 
‘This, Manlius, is not foul, magis (more) it is miserable.’ 
(Catullus, 68, 30, Ducrot and Vogt 1979) 


As suggested by Marconi and Bertinetto (1984: 482), such a shift could originate 
in elliptical structures such as (11). 


(11) non q, magis (quam q) p > non q, ma (piuttosto che q) p 
‘not q, more (than q) p’ > ‘not q, but (rather than q) p’ 


An analogous elliptical structure could be assumed to have played a part in the 
evolution of anzi, as (12) shows. 


(12) anzi q che p > non p, anzi (che p) q > non p, anzi q 
anzi amare che temere > non temere, anzi (che temere) amare > non temere, anzi amare 
‘before/rather love than fear’ > ‘not fear, rather (than fear) love’ > ‘not fear, rather love’ 


However, as argued by Marconi and Bertinetto (1984), explanations of this kind 
should not be accepted lightly, as the reconstruction of underlying phenomena, 
when not made on the basis of compelling syntactic evidence, can easily border 
on arbitrariness. Moreover, it is unlikely that speakers really opt for such convoluted 
ways to achieve their communicative purposes. This appeal to the speakers suggests 
an alternative hypothesis. 


3 A new hypothesis 


Let us start by refining the chronology of the phenomena, on the basis of the his- 
torical dictionary of Old Italian Tesoro della Lingua Italiana. Anzi is attested here 
already in 1211, with the value of a preposition expressing anteriority in time. It 
appears even earlier, already in the 12th century, as an adverb/conjunction with 
an adversative value (albeit with temporal connotations). The corrective value 
(i.e., the one without negation, as in p, anzi q) is attested at the end of the 13th 
century. A few examples of the spatial value persist into the 14th century. 

As is well-known in diachronic research, due to the paucity and non-representativeness 
of the data, a definitive reconstruction of the phenomena is not always 
possible. Often, as for particles in physics, the only way is to look for their 
traces. One of the most interesting traces, which relates to the question of the 
complex relation between Latin and the vernacular of the time, is provided by 
volgarizzamenti, translations and adaptations of Latin and French texts into the vernacular. By 
looking at what anzi is a translation of, we can try to understand how it was perceived 
by translators in those centuries. 

Using the Dizionario dei Volgarizzamenti (Guadagnini and Vaccaro 2016; 
DiVo),³ we discover that, in the temporal uses, anzi translates Latin ante(quam), 
pridie (quam) and prius quam ‘before’. In the comparative, anzi p che q renders 
nimis/potius/magis p quam q ‘more p than q’. Yet, the most conspicuous Latin 
originals are sed ‘but’ in the adversative uses like (13) and immo ‘rather’ in the 
corrective ones like (14).⁴ 


3 Thanks to Elisa Guadagnini for her precious help with the database. 
4 On immo, see Rosén (2003). 


(13) a. e se lli sovrani uomini e conosciuti cittadini Saturnini, Gracchi e Flacchi e 
molti altri maggiori non solamente non si contaminaro di sangue, anzi se 
n'adornaro d'onestade... 

b. etenim si summi viri et clarissimi cives, Saturnini et Gracchorum et Flacci 
et superiorum complurium sanguine non modo se non contaminarunt, sed 
etiam honestarunt... 

‘If the highest and most famous citizens, Saturninus, and the Gracchi, 
and Flaccus and many that were not stained with blood, sed (but) also 
honored...’ 

(DiVo, Prima catilinaria volg. (red. A), a. 1294 (fior.), p. 5004.37: LAT) 


(14) a. le cui bactaglie, anci sotto le cui battaglie... 
b. cuius bella immo sub cuius bellis... 
‘whose wars, immo (rather), under whose wars’ 
(DiVo, Bono Giamboni, Orosio volg. (ed. Matasci), a. 1292 (fior.), L. III, 
cap. 8, p. 301.30: LAT) 


Thus, if anzi is perceived as the equivalent of sed and immo (which moreover con- 
tains a scalar component) in the 13th century, its contrastive-corrective value ap- 
pears to be fairly conventionalized already then. Assuming that this value is de- 
rived from a comparative structure, it is reasonable to conclude that the transition 
had already taken place in the documented period. 

However, the high polysemy of anzi right from the start may induce us to consider 
a different hypothesis, according to which the contrast component would 
be in some way already present and inherent in anzi.⁵ By using anzi, whether to 
follow a negative clause or to introduce a reformulation, the medieval speaker 
would use it with the meaning of ‘in front of, opposite’, which is inherent in the 
spatial value of the particle. The spatial value in anzi is indeed of a relational kind, 
‘p in front (of q)’, and contains a component of contrast, as already pointed out 
by Bazzanella (2003) and Musi (2016: 23). 

As highlighted by Banfi and Arcodia (2009: 179) in a study of the derivation 
of coordinative markers in Indo-European languages, “the primary idea at the 
basis of the process of seriation is represented via morphs that highlight the ‘antithesis’, 
the ‘contraposition’ between the elements of a series”, in particular continuations 
of the Indo-European root *-nt-i. The outcomes of this base, which indicates 
opposition/contrast, are locative forms, such as Sanskrit ánti ‘opposed to, instead 
of’, Greek ἀντί ‘in place of, in the face of, close to’ and, indeed, Latin ante/antea 
‘ahead of, before’. 


5 I thank Lele Banfi for this important suggestion. 

The evolution of the meaning of anzi from spatial and temporal to contrast 
and correction would thus not have taken place in a linear way, through the me- 
diation of the comparative structure, as envisaged by Visconti (2015). Rather, it 
would be the outcome of a series of parallel processes, by which the component 
of contrast, inherent in the spatial value, manifests itself in the different construc- 
tions in which anzi is realized: by following a negation, to suggest a point of view 
“in front” of a negated one or to introduce a better formulation “in front” of a 
previous, less adequate, one. 

Both hypotheses tread the treacherous ground of the relationship with 
Latin. Whereas the spatial, temporal and comparative uses of anzi continue 
those of Latin ante, the contrastive-corrective uses appear to be an innovation. 
According to a first, yet authoritative, survey,⁶ a corrective construction of the 
kind non turpe, ante miserum ‘not foul, rather miserable’ does not appear to exist 
in documented Latin. Such a construction was instead possible with magis, as in 
(10). 

Yet, the existence in Medieval Latin of comparative structures such as (15) 
could, in the absence of further documents, endorse the original hypothesis of 
the development through the comparative. 


(15) addere fecimus ut antea supercrescat quam deficiat 
‘Let us add so that there antea (rather) be too much than too little.’ 
(Adalhardi abb. Corbejens. statuta (a. 822), c. 6 ed. Levillain, LMA t. 13 
(1900) p. 356) 


Many unsolved questions or, seen differently, many fascinating paths face us at 
this point: for instance, the role of Greek ἀντί, in which the component of contrast 
is strongly present, or the Gallo-Romance influences, in particular the role of 
French ainz in the diffusion of anzi, as can be seen in the intermediate texts in 
DiVo, as in (16). 


6 For which I thank Raffaella Tabacco. 


(16) a. se alcuna leggie difende che homo non frusti alcuno homo che ssia giudichato 
a morte, alcuna leggie dice che homo none ucida citadini dapnati, 
ansi ne i vé homo tucto giorno ischanpare 

b. se aucune loi deffent que l'en ne frustast home jugé a mort, aucunes lois 
redient que l'en n'occie pas citien dampné, ainz l'en envoit l'en en exil a 
touzjors 

c. an quia lex Porcia vetat? at aliae leges item condemnatis civibus non 
animam eripi, sed exilium permitti iubent. an quia gravius est verberari 
quam necari 
‘But according to other laws, the condemned should not be deprived of 
life, ansi (but rather) sent into exile.’ 

(DiVo, Orazioni di Cesare e Catone (red. alfa), 1285/99 (pis.), Oraz. di 
Cesare [Tes., III.35], pag. 122r.10: FR 51.22) 


Moreover, given the dialogic nature of both contrast and correction, an interesting 
hypothesis to be pursued further is that such values originated in rebuttal 
contexts, and thus in spoken language. 


4 Conclusion 


The context dependency of procedural items like anzi and their frequent associa- 
tion with informal, colloquial usages make their diachronic investigation partic- 
ularly challenging, as historical work has to rely on written texts, which record 
speech with varying degrees of accuracy and provide no access to prosodic cues. 
In this respect too, Johan's research has been paving the way across many lan- 
guages and textual traditions. 